When I tell people that I majored in oceanography, I get responses that range from “That sounds amazing, and I need to hear more!” to “I don’t know what that is, so I’ll just remember that you like ocean stuff.” These days, I tell people that I’m studying ocean data science, but not many people know what that means, so I thought I would write about it here. This post is the start of my blog series about ocean data science, and my goal is to explain why data science is necessary for the grand exploration of the world’s oceans.
I want to start by answering one glaringly obvious question: what is oceanography? Many people think that oceanographers are like marine biologists who study marine organisms and ecology. That is part of the picture, but in addition to studying the organisms that live in the oceans, oceanographers also study the oceans themselves. And yes, I mean that literally: we study all of the seawater in all of the planet’s oceans, seas, estuaries, and more (we even work with astronomers to study oceans on other worlds!).
Oceanography lives at the intersection of multiple scientific disciplines, and many oceanographers focus on one of them for most of their research. There are four main disciplines in oceanography: biological, chemical, geological, and physical, which I’ll dive into below (pun intended). These disciplines sometimes hog the spotlight, but there are also many marine technicians and engineers who devote their lives to building oceanographic sensors that can collect data in some of the most extreme environments on our planet, like the seafloor! They make all of our research possible.
Many oceanographers will spend their career studying one of the following:
- the biology, life cycles, and ecology of marine organisms, especially the diverse group of microorganisms called plankton that feed many of the ocean’s inhabitants
- the chemical composition of seawater and the global-scale biogeochemical cycles of its components (such as carbon, nitrogen, and sulfur)
- the geology and formation of the seafloor, the geochemistry of hydrothermal circulation, and tectonic plate activity
- the physics of ocean waves, currents, tides, gyres, eddies, air-sea interactions, deepwater formation, and underwater acoustics
In addition, modern research includes the study of many anthropogenic issues, such as ocean warming, ocean acidification, plastic pollution, overfishing, and oil spills. The field of oceanography expands quite far beyond this short list, and I will be happy to write about these topics in the future!
From here, we can see that there are many aspects of our planet’s oceans that we want to learn more about, but how do we measure any of these things? The short answer is, lots of ways!
Satellites are great for measuring large regions of the sea surface. Several of them currently orbit Earth and can give us snapshots of the entire planet every few days. Satellites measure chlorophyll levels (a metric for estimating plankton abundance), salinity, temperature, and sea surface height, but they struggle to see through cloud cover and are limited to the top few centimeters of the oceans. To get measurements below the surface, various free-floating sensors have been released at sea to drift along currents and send data to land via satellite transmission. Many of these floating sensor packages are even capable of changing their buoyancy and can take measurements as they move up or down the water column. One of the most well-known programs in oceanography is the Argo program, which has been deploying its drifting instruments for 20 years! Fleets of remotely operated vehicles (ROVs) can also gather data in areas that are difficult, dangerous, or impossible to navigate by ship.
There are also sensor packages placed on the seafloor that passively measure seawater components, acoustics from marine organisms and passing ships, and seismic activity from submarine volcanoes. With the implementation of fiber optic cables, these seafloor packages can stream data back to servers on land with breakneck speed, 24/7. Of course, countless oceanographers still collect data the old-fashioned way: by hand, on ships!
The image below shows all the various floats, buoys, and stations deployed and active as of February 2021 — it turns out the ocean is quite populated with our technology!
After seeing this map, you might be thinking, “Wow, all of those sensors must be collecting so much data!” And you’d be right! In fact, we’re currently collecting so much data that we can’t even process and analyze all of it with the current systems we have in place. This is becoming more and more true with every additional float and vehicle we deploy. Fortunately, many folks are already tackling this problem by working towards innovative solutions in data science to allow researchers to make full use of their ever-expanding datasets.
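To give a flavor of what “processing data we can’t hold all at once” can look like, here is a minimal Python sketch of one common idea: updating summary statistics one reading at a time (Welford’s online algorithm), so the full dataset never has to sit in memory. This is only an illustration, not how any particular observing program works; the `RunningStats` class, the `summarize` helper, and the temperature values are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new reading into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; NaN until at least two readings have arrived."""
        return self.m2 / (self.count - 1) if self.count > 1 else float("nan")


def summarize(readings) -> RunningStats:
    """Consume any iterable of readings without storing them."""
    stats = RunningStats()
    for reading in readings:
        stats.update(reading)
    return stats


# Hypothetical sea surface temperatures (degrees Celsius), read as a stream
stream = iter([12.0, 12.5, 13.0, 12.5])
stats = summarize(stream)
print(stats.count, stats.mean, stats.variance)
```

Because `summarize` accepts any iterable, the same function would work on a generator that reads a file line by line or on values arriving over a network, which is the point: memory use stays constant no matter how many readings come in.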
Handling this flood of sensor data is only one way that data science and programming are useful to oceanographers. In reality, every research study requires data collection and statistical analysis, which means every oceanographer needs to have some analytical skills. Additionally, as data collection techniques become more efficient and inclusive, the field as a whole is shifting to be more quantitative. This means that intuitive, open-source programming languages like Python will see more use in marine research.
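As a small taste of the kind of analysis this involves, here is a hedged Python sketch that averages temperature readings from a single float profile into fixed-width depth bins, a typical first step before comparing profiles. The function name `bin_profile`, the 10 m bin size, and the depth and temperature values are assumptions made up for illustration; real float data comes with quality flags and formats that this sketch ignores.

```python
from collections import defaultdict
from statistics import mean


def bin_profile(depths_m, temps_c, bin_size_m=10.0):
    """Average temperature within fixed-width depth bins.

    Returns a dict mapping each bin's shallow edge (in meters)
    to the mean temperature of the readings inside that bin.
    """
    if len(depths_m) != len(temps_c):
        raise ValueError("depths and temperatures must have the same length")

    bins = defaultdict(list)
    for depth, temp in zip(depths_m, temps_c):
        # Floor each depth to the shallow edge of its bin (e.g. 12 m -> 10 m)
        shallow_edge = (depth // bin_size_m) * bin_size_m
        bins[shallow_edge].append(temp)

    return {edge: mean(values) for edge, values in sorted(bins.items())}


# Made-up profile: depth in meters, temperature in degrees Celsius
depths = [2.0, 8.0, 12.0, 18.0, 25.0]
temps = [15.0, 14.0, 12.0, 10.0, 8.0]

binned = bin_profile(depths, temps)
for edge, avg in binned.items():
    print(f"{edge:5.1f} m and deeper: {avg:.2f} °C")
```

In practice, most people would reach for libraries built for array and gridded data rather than plain dictionaries, but the underlying idea (group by depth, then summarize) is the same.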
This is where my passion for ocean data science was born and why I’m pursuing data science for environmental good. Just like oceanography, many other disciplines in the environmental sciences are becoming more quantitative. Developing new techniques to process, integrate, and analyze enormous datasets will be vital for furthering our understanding of the amazing planet we live on. In this field, we’re fortunate enough to go on exciting voyages at sea and watch the teeming life at hydrothermal vents through camera feeds. Still, a lot more computer work needs to be done to get the most out of our growing collection of oceanographic datasets. I’ve got my work cut out for me, and I’m looking forward to bringing the results of it to you here.