Organic Agriculture vs. Genetically Modified Organisms (GMO): The “Baxter vs. Marsh” Case Study

Ivan Del Valle

Abstract

The case study “Baxter vs. Marsh” centers on organic agriculture versus genetically modified organisms (GMOs). The events took place in Kojonup, 160 miles southeast of the city of Perth in Western Australia. The Baxter and Marsh farms were adjacent to each other, with Marsh’s farm focused on organic farming and Baxter’s on GMO crops. Just before Baxter’s first crop of genetically modified canola was harvested, the standing crop was sprayed with herbicide, and rather than being direct harvested, the crop was swathed and left in the field (exposed to the elements) for collection in two or three weeks. GMO swathes and seeds were subsequently found over much of Marsh’s farm, which lost its organic certification, triggering a legal action that “ran over three weeks, and was then dismissed in its entirety; with no nuisance, negligence, injunction, or damages” (Paull, 2014). This paper explores the problem, its causes, possible alternatives, and a recommended action plan.

Keywords: Organic Farming, GMO, Genetically Modified Organisms, Baxter, Marsh

The “Baxter vs. Marsh” Case Study

The introduction of genetically modified seeds and crops has given rise to substantial public debate, with proponents and opponents taking hard-line positions over the advantages and disadvantages of their introduction. Food from genetically modified organisms (GMOs) has met considerable rejection among European Union (EU) consumers. The EU import ban on GM food has triggered a great deal of controversy and has been partly replaced by a mandatory labeling scheme.
The words health, ecology, and environment are usually associated with debates about GMOs versus organic farming. “The anti-GMO lobby accuses proponents of this technology of pushing the introduction of GMOs into agriculture without adequately considering health and environmental risks. The pro-GMO camp charges its opponents with blowing potential risks out of proportion in order to manipulate public opinion against this new technology” (Marris, 2001). Four crops dominate most of the world’s GMO farming: soy, corn, cotton, and canola (CBAN, 2015). Five countries account for 90 percent of the production: the United States, Brazil, Argentina, India, and Canada (CBAN, 2015). In contrast, “Australia is the country with the most organic agricultural land in the world, with 97 percent of the farmland being extensive grazing areas” (Willer & Lernoud, 2019).

Organic farming is much more widely distributed around the world, with more than 162 countries reporting farming statistics. As of 2015, 40 percent of the world’s certified organic agriculture came from Australia, followed by Argentina, the United States, China, and Spain (Paull, 2015). This case study centers on a GMO intrusion/contamination event that took place in Kojonup, a rural area with approximately 2,100 inhabitants located in the southwest corner of Australia.

The Problem

Steve Marsh grew organic oats, wheat, rye, and spelt and raised sheep just outside the town of Kojonup. His farm had been certified organic since 2006. Marsh’s neighbor was Michael Baxter, who grew cereal crops and canola and raised sheep. Their farms share a common boundary of about 2.2 miles; Baxter’s farm covers 2,223 acres and Marsh’s 1,178. In 2010, Baxter planted GMO canola in two of his boundary paddocks and non-GMO canola in the middle (saying he had run out of GMO seed). Marsh warned Baxter that such action could jeopardize his organic certification, and Baxter proceeded anyway.
The result was that GMO canola blew across the organic farm (swathes were identified), causing 70 percent of Marsh’s farm to be decertified as organic in 2010.

GMO canola had been approved in Australia since 2003, and “the need for maintaining a segregation of GMO and non-GMO agriculture was identified as an issue from the outset. However, no precautions, protocols or penalties for breach were legislated” (Paull, 2015), triggering moratoriums that were lifted in 2010. In April of 2012, Marsh sued Baxter, and the case was first heard in February of 2014 over 12 days. The case had four primary elements: nuisance, negligence, injunction, and damages. First, Marsh had to prove that the events had been a nuisance to him and that Baxter was negligent. Second, the plaintiff sought an injunction, with the court ordering Baxter to change his behavior going forward, and payment by Baxter of the losses Marsh incurred as a result of the organic decertification.

Literature Review

The “Baxter vs. Marsh” case study has become one of the most representative examples of legal confrontation between the groups representing organic goods and genetically modified organisms (Paull, 2014). The lack of sufficient research and regulation to guarantee the safety of GMOs, together with the aggressive tactics of the companies holding their patents, has diminished their credibility and failed to win public opinion over to this new technology. As Marris (2001) notes, it is impossible to anticipate all risks in the long term, yet this uncertainty is not admitted or taken into account in the decision-making process.

Four crops (soy, corn, cotton, and canola) dominate most of the world’s GMO farming (CBAN, 2015). Forty percent of the world’s certified organic agriculture comes from Australia, followed by Argentina, the United States, China, and Spain (Paull, 2015).
In addition, Australia is the country with the most organic agricultural land in the world (Willer & Lernoud, 2019), so it was only a matter of time before the country took center stage in discussions of the topic.

What started as a feud between two farmers over damages estimated at 60,000 Euros eventually escalated to legal costs of 570,000 Euros that were, surprisingly, awarded to Baxter (Paull, 2015). In an unexpected twist, the court suggested that the organic decertification of some of Marsh’s paddocks due to contamination resulted from unrealistic expectations on the part of the certifying agency (Kym, 2015).

While South Australia has a moratorium on the commercial cultivation of GMO crops, scheduled until 2025 (Anderson, 2019), both sides should allocate time and resources to a stakeholder analysis of the potential threats, weighed against the possibility of mutual benefits driven by a cooperative approach (Watt, 2014).

Root Cause Analysis

The method Baxter used to harvest the GMO canola quickly took center stage in the case. He had grown non-GMO canola for a decade, and he had always beheaded the crop, harvesting the seeds immediately as part of the process. For the GMO canola, Baxter changed the process and swathed the crop (cutting the stalks, applying herbicide, windrowing the cut stalks, and leaving them for 3 weeks before collection), which explains how the GMO canola material (stalks, swathes, seed pods, and seeds) came to be blown across the Marsh farm during those 3 weeks.

The judge accepted that 245 GMO canola swathes were blown onto Marsh’s organic farm, “intruding” 0.75 miles into the farm. The judge ruled it an “intrusion” rather than a “contamination”, which did not align with the organic certifier’s definition. He ruled that there was no physical damage to the Marsh farm and suggested that the organic farmer take up the decertification with his certifier rather than with Baxter.
On the negligence claim, the judge stated that Baxter had not used swathing as a harvest method before, so he could not have foreseen the blowing of the GMO material onto Marsh’s farm.

The injunction claim was dropped as well, and what started as a conversation about a buffer zone between future GMO crops and Marsh’s farm was replaced with a failed request for a ban on swathing as a canola harvest method. The fourth element of the plaintiff’s case concerned damages. “Damages to Marsh’s enterprise were agreed between the parties at 60,000 Euros, mostly due to the loss of the organic premiums on farm outputs due to the loss of certification, but having lost the case, legal costs of 570,000 Euros were awarded to Baxter” (Paull, 2015).

To summarize, the judge declared no nuisance, no negligence, no injunction, and, in practical terms, no damages. “Liability for these costs has been appealed. At the time of writing, Marsh has not paid this to Baxter, and it appears that Baxter has not paid this to his lawyers. The results of appeals against the judgement, and against the awarding of costs are awaited. The legal avenue in this case is the High Court of Australia.” (Paull, 2015).

Restating the Problem & Causes

“It is impossible to anticipate all risks, especially in the long term, but uncertainty is not admitted and not taken into account in the decision-making process” (Marris, 2001). The farms of Baxter and Marsh adjoin one another, with a road separating them. In 2010, Baxter planted GMO canola next to Marsh’s organic farm after its commercial production was authorized by the government of Australia.

In November of 2010, Baxter swathed his GMO canola and allowed it to dry for a period of three weeks before harvesting. During that time, the wind blew approximately 245 swathes with attached seed pods, containing viable canola seeds, into several sections of the Marsh farm.
In December of the same year, Marsh was notified by his organic certifying agency that several paddocks of his farm were decertified because they did not meet the minimum threshold requirements. Forced to sell the decertified crops at conventional prices rather than at organic premium prices, he suffered multi-year economic losses that he tried to recover by suing Baxter.

Marsh sued on claims of nuisance and negligence and also petitioned for an injunction prohibiting Baxter from planting GMO canola within a certain distance of his farm. The case drew a lot of media attention, as Baxter was backed by Monsanto (owned by Bayer by the time of this paper’s release), the company that held the patents on the GMO canola. The Court dismissed the three causes of action, not only ruling in favor of Baxter but also reserving the issue of costs for further legal proceedings.

Possible Alternatives

“To date there is no satisfactory resolution to the grievances of Marsh, and the outcomes so far provide no assurances to the organics sector that GMO is not a threat to the viability of their agricultural practices and their business model” (Paull, 2015). It is important to mention that during the initial trial, scientific evidence showed that none of Marsh’s crops could acquire any genetic traits of the GMO canola. As a result, the approximately 245 swathes found on Marsh’s farm were labeled an “intrusion” event rather than a “contamination” one.

Potential alternatives for avoiding “intrusion” events could include risk mitigation practices that keep GMO canola swathes from being carried unintentionally onto Marsh’s farm by the natural elements. That kind of event could be minimized or eliminated by voluntarily adhering to the traditional method of canola harvesting (beheading the crop and harvesting the seeds immediately).
If voluntary compliance is not an option, let’s not forget that “South Australia currently has a moratorium on the commercial cultivation of GM food crops which is scheduled to continue until 2025” (Anderson, 2019). New local regulations could be introduced to override the existing moratorium and prevent GMO farmers in close proximity to organic land from using harvesting methods that could pose direct or indirect risks to organic certification.

Another major outcome of the court decision was the judge’s strong comment implying that the party to be sued for damages should have been the organic certifier, NASAA (The National Association for Sustainable Agriculture in Australia), for having unrealistic organic standards based on a zero-tolerance requirement that differed from most standards of other agencies across the world. “The courts found that the decertification due to contamination was a ‘gross overreaction’ on the part of the agency. The organic farmer was told he should instead sue the organic certification body” (Kym, 2015).

A potential alternative to the NASAA zero-tolerance requirement could be to have more than one regional organic-certification agency and to select for certification the one that best aligns with the realities of local dynamics and export regulations. Standards should be pursued in alignment with the definitions and thresholds determined by the top certifiers in the world. “Doing a cost-benefit analysis that considers liability based on organic certification standards and property damage, not on normative ideas of what a politically appropriate level of contamination is, ought to be included as a fundamental phase in the assessment process” (Kym, 2015).

The last potential alternative discussed in this paper consists of taking a totally different approach to the situation by leveraging the common ground between Marsh & Baxter instead of focusing on the areas of confrontation.
Together they could run combined pilots, based on independent research and driven by experts on both sides of the organic versus GMO spectrum, as a way to explore options and agreements on what crops to harvest and where, the distance between them, and potential innovations to the harvest process. This collaborative approach could result in a partnership that not only improves their communication and the visibility of their business operations, but also translates into new marketing strategies from which both could benefit. The potential social impact of GMOs, especially on underdeveloped countries, should not be minimized or ignored under the stigma of being ‘evil’ because of the unfortunate approach that the company holding the patents in this case took to promote and defend its products.

Analysis and Recommendation

The following “Pros & Cons” table summarizes the potential alternatives to the Marsh vs. Baxter case study.

“Perhaps the importance of good communication is best understood by considering what things would be like in its absence” (Reference for Business, n.d.). The recommended path forward consists primarily of leveraging the common ground between Marsh & Baxter instead of focusing on the areas of confrontation. Although this could be perceived as a defeat for organics and as validation of GMOs as a safe food option that requires no further research, that is not the point. The point is to balance innovation with the realities of a changing world that needs collaborative, sustainable solutions, not only for our generation but for those to come.

As per Watt (2014), the first step in the stakeholder management process is to identify the main stakeholders, followed by an analysis of the stakeholders’ potential for threat and for cooperation, and of what strategy to adopt.
Under the right mediation, Marsh & Baxter could address their differences in a mature way, leaving behind an unfortunate chapter of legal actions that did not result in anything beneficial for most customers on either side. The second recommended action consists of the Government approving more than one regional certification agency, with potentially different requirement thresholds. Instead of being seen as a threat to the organic movement, this should be seen as an opportunity to trigger conversations among the different organizations toward the goal of globally consistent standards.

Recommended Implementation Plan

The high-level recommended implementation plan and steps would look like this:

1. During the first two to four weeks, Marsh & Baxter call on their legal teams to meet and consider dismissing the legal actions still in progress. Once agreements are reached, they will communicate to the Court a proposal on how they would like to proceed.

2. Both teams agree on a mutually selected entity that will pursue neutral funding and conduct independent research on both sides of the organic and GMO spectrum in order to produce, within a six-month period, recommendations on matters such as (but not limited to) what crops to harvest and where, and changes to harvest processes.

3. In parallel, the Government of Western Australia should authorize additional organic certification agencies. This will be part of a new comprehensive effort to align and standardize, as much as possible, the certification requirement thresholds with the dynamics of the real world, while educating end customers as necessary and appropriate.

4. Implementation takes place based on a structured legal and project execution management framework. Learnings are documented and discussed through a newly created governance team.

5. Marsh & Baxter share their learnings on how to facilitate the co-existence of GMO and organic farming through conferences and seminars.
Any proceeds from those meetings should be reinvested in further related research.

Conclusion

“Expanding the definition of risk assessment to include social and economic impacts would likely incorporate some concerns that organic and integrated pest-management farmers have about gene flow contamination, however this will never be enough to avoid all future liability” (Kym, 2015). Focusing on the common ground instead of taking a confrontational attitude toward the differences should open the door to a new era of research, collaboration, and innovation, in which both Marsh and Baxter could return their focus to, and mutually benefit from, what they really enjoy doing, which is farming. With additional research, as well as through a renewed sense of global citizenship in a world full of unanswered questions, GMOs could represent much more than a business opportunity or threat: they could be an agent of change with a social impact that translates into new solutions to complex issues, such as how to feed a world with limited resources and an exponentially growing population. There are infinite shades of grey between what some people would consider either white or black. It is up to us, as custodians of the resources of the generations to come, to put aside our differences and focus objectively on a future full of alternatives, within an ethical framework in which each one of us has a voice and the power to influence the whole.

References

CBAN. (2015). Where in the world are GM crops and foods? The reality of GM crops in the ground and on our plates. Ottawa, Canadian Biotechnology Action Network (CBAN).

Kym, S. (2015). Legally Containing the Uncontainable: Establishing a Liability Scheme for GE contamination in Canadian Agriculture. York University.

Marris, C. (2001). Public views on GMOs: deconstructing the myths. European Molecular Biology Organization. EMBO reports, vol. 2, no. 7, pp. 545–548.

Paull, J. (2014). Organic versus GMO farming: Contamination, what contamination? Journal of Organic Systems, 9 (1), pp. 2–4.

Paull, J. (2015). The threat of genetically modified organisms (GMOs) to organic agriculture: A case study update. Agriculture & Food, 3, pp. 56–63.

Reference for Business. (n.d.). Communication in organizations. http://www.ref-

Watt, A. (2014). Project Management. Licensed under the Creative Commons Attribution 4.0 Unported License.

Willer, H. and Lernoud, J. (2019). The World of Organic Agriculture. Statistics and Emerging Trends 2019. 20th Edition. Research Institute of Organic Agriculture FiBL and IFOAM Organics International, Frick and Bonn.


Smart Real-Time Data Analytics and AI: Banking & small business lending becomes more intelligent


So how exactly do these innovations apply in fintech? In this article, we investigate how AI and real-time smart data analytics add value to the fintech sector, deliver a better client experience, and enhance small business lending processes.


Continuous Testing for Machine Learning Systems


Validate the correctness and performance of machine learning systems through the ML product lifecycle.

Testing in the software industry is a well-researched and established area. The good practices learned from countless failed projects help us release frequently and see fewer defects in production. Common industry practices like CI, test coverage, and TDD are widely adopted and tailored to each project.

However, when we try to borrow the software engineering (SWE) testing philosophy for machine learning, we have to solve some unique issues. In this post, we’ll cover some common problems in the testing of ML models (systems) and discuss potential solutions.

The ML system here stands for a system (pipeline) that generates predictions (insights) which can be consumed by users. It may include several machine learning models. For example, an OCR model (system) could include one ML model to detect text regions, one ML model to classify the current text region (car plate vs. road sign), and one model to recognize the text in a picture.

A model is composed of the code (algorithm, pre-processing, post-processing, etc.), the data, and the infrastructure that facilitates the runtime.

[Figure: The scope of ML system testing. Image by author.]

Different types of testing cover quality assurance for different components of the system.

Data testing: ensuring new data satisfies your assumptions. This testing is needed before we train a model and before we make predictions. Before training the model, it covers both X and y (the labels).

Pipeline testing: ensuring your pipeline is set up correctly. It’s like integration testing in SWE. For an ML system, it may measure consistency (reproducibility) as well.

Model evaluation: evaluating how good your ML pipeline is.
Depending on the metrics and dataset you’re using, it could refer to different things:

Evaluation on a holdout/cross-validation dataset.

Evaluation of deployed pipelines against ground truth (continuous evaluation).

Evaluation based on the feedback of system users (business-related metrics, not a measurable ML proxy).

There are a number of techniques that can be applied in the process, like slice-based evaluation, MVP (a critical subset of data) groups/samples analysis, ablation studies, and user subgroup-based experiments (like beta testing and A/B testing).

Model testing: involves explicit checks for behaviors that we expect our model to follow. This type of testing is not meant to report accuracy-related indicators, but to prevent the model from behaving badly in production. Common test types include, but are not limited to:

Invariance (Perturbation) Tests: apply perturbations to the input that should not affect the model’s output.

Directional Expectation Tests: apply changes to the input that should have a predictable effect on the model output. For example, if the loss of blood during a surgery goes up, the blood for transfusion should go up as well.

Benchmark regression: use predefined samples and an accuracy gate to ensure a new version of the model won’t introduce severe regressions.

[Figure: When do we perform tests? Image by author.]

Some people may ask why we need both holdout evaluation and continuous evaluation to measure almost the same metrics at CI time and at serving time.

One reason we can’t fully estimate model performance from metrics on a predefined holdout dataset is that data leakage is sometimes harder to detect than it looks. For example, some features that were expected to exist at serving time turn out to have high latency to acquire, so our trained models are not used to seeing this feature always being empty.

Sometimes model evaluation can be very expensive, so integrating a full-cycle holdout evaluation into CI is not feasible.
In this case, we can define a subset regression evaluation within the CI, and only do the full evaluation before important milestones.

Model testing is not a one-off step; it should be a continuously integrated process with automation in place. Some of the test cases can be performed within the CI process, so each code commit will trigger them and we can guarantee the code/model quality in the main branch of a repo. Others can be conducted in the serving environment, so we won’t be blind to how well our system performs, and we have relatively sufficient time to fix issues when they appear. Sometimes the ongoing tests within the serving environment can be seen as part of the monitoring component, and we can integrate them with an alerting tool to close the loop.

Machine learning systems are not straightforward to test, not only because they include more components (code + data) to verify, but also because they are dynamic in nature. Even if we didn’t change anything, our models can become stale because the data changes (data drift) or the nature of things changes (concept drift) over time.

Automated testing is an essential component in CI/CD for verifying the correctness of pipelines with a low footprint. Manual tests and human-in-the-loop verification, meanwhile, are still crucial steps before we say a new ML pipeline is production-ready. After a pipeline has been released to production, continuous monitoring and evaluation can ensure we’re not flying blind. Finally, customer feedback-based tests (e.g., A/B tests) can tell us whether the problem we are trying to solve is actually getting better.

There is no silver bullet in ML system testing; continuously trying to cover edge cases helps us leave fewer opportunities for mistakes. Hopefully, one day we can figure out a simple metric like code coverage to tell whether our system is good enough.
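The model test types described above (invariance, directional expectation, and benchmark regression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `predict_transfusion_units` is a hypothetical stand-in for a trained model, its formula and the benchmark samples are made up for the example, and a real project would more likely run such checks through a test framework in CI.

```python
"""Toy sketch of invariance, directional-expectation, and benchmark-gate checks."""


def predict_transfusion_units(blood_loss_ml: float, patient_name: str = "") -> float:
    """Hypothetical model: units of blood to transfuse for a given blood loss.

    The formula is invented for illustration; patient_name is deliberately ignored.
    """
    return max(0.0, (blood_loss_ml - 500.0) / 450.0)


def check_invariance() -> bool:
    # Perturbing an input that should be irrelevant (the patient's name)
    # must leave the prediction unchanged.
    base = predict_transfusion_units(1200.0, patient_name="Alice")
    perturbed = predict_transfusion_units(1200.0, patient_name="Bob")
    return base == perturbed


def check_directional_expectation() -> bool:
    # As blood loss goes up, the predicted transfusion must never go down.
    losses = [0.0, 400.0, 800.0, 1200.0, 2000.0]
    preds = [predict_transfusion_units(x) for x in losses]
    return all(a <= b for a, b in zip(preds, preds[1:]))


def check_benchmark_gate(max_mean_abs_error: float = 0.25) -> bool:
    # Predefined (input, expected output) samples plus an accuracy gate:
    # a new model version fails the check if its mean error exceeds the gate.
    benchmark = [(500.0, 0.0), (950.0, 1.0), (1400.0, 2.0)]
    errors = [abs(predict_transfusion_units(x) - y) for x, y in benchmark]
    return sum(errors) / len(errors) <= max_mean_abs_error


if __name__ == "__main__":
    for check in (check_invariance, check_directional_expectation, check_benchmark_gate):
        print(check.__name__, "passed" if check() else "FAILED")
```

Because each check is a plain function returning a boolean, the cheap ones can run on every commit, while a larger benchmark set can be reserved for milestone evaluations.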


AI for sustainable Value Chains

Irrigation terrace

Faced with the rise of these new expectations, companies are demanding more from their suppliers and improving the sustainability of their value chains. However, the required transformation does not stop there: it is now necessary to ensure the quality of the information that circulates in supply chains and to empower each actor with suitable technological tools to manage it. To take full advantage of artificial intelligence (AI) and its capabilities, large amounts of data are necessary. As value chain data is too often scattered, a system integrating supply chain data collection and consolidation is needed.


Blossom Capital lures Alex Lim from Silicon Valley to join the European tech boom

Alex Lim

Alex Lim, a British-born VC based in the Bay Area who invested in Hopin, UiPath, Discord, and many other unicorns, has decided to up sticks and leave Sand Hill Road behind for Blossom Capital in London. Blossom is fast making a name for itself both in Europe and internationally, having invested in breakout hits like Tines, Duffel, and


How enterprise MLOps supports scaling Data Science

Business meeting

For companies investing in data science, the stakes have never been so high. According to a recent survey from New Vantage Partners (NVP), 62 percent of firms have invested over $50 million in big data and AI, with 17 percent investing more than $500 million. Expectations are just as high as investment levels, with a survey from Data IQ revealing that a quarter of companies expect data science to increase revenue by 11 percent or more.


Qualcomm stakes beachhead in Artificial Intelligence with Foxconn Gloria AI Edge Box

Qualcom cloud AI 100

When most folks think of Qualcomm, the first technologies that likely come to mind are the company’s industry-leading mobile platform system-on-chips for smartphones, as well as the company’s end-to-end 5G connectivity solutions. However, whether you consider applications like image recognition, speech input, natural language translation or recommendation engines, modern smartphone platforms typically require a lot of artificial intelligence (AI) processing horsepower as well.


Miso Robotics names Christopher Kruger new CTO and expands Board Of Advisors to scale and grow in new markets

Miso robotics

Miso Robotics, the startup transforming the foodservice industry with Intelligent Automation, proudly introduces its newly appointed Chief Technology Officer, Christopher Kruger, in addition to an expanded Board of Advisors comprised of industry experts William (Bill) Mitchell, John Inwright and Jane Gannaway.


Love robots + UI: The role of AI in the design industry

Abstract robots

The idea of creating a machine that can mimic human intelligence is a mainstay in the field of technology. We have already made the jump from “AI” being a movie from the early 2000s to something we take for granted as it sets our alarms for us on our iPhones. However, contrary to what we may believe, AI is still in a nascent space, and there are still some ways to go with regards to robots completely taking over the design industry.


Are you doing AI with Humans in Mind?


Whether you’re building or buying AI, if you don’t think about user experience, you might experience no users. So let’s get some insight from Sean Gourley, the founder and CEO of Primer, an AI company focused on natural language processing — you know, machines that can “read” and “write” and maybe someday “use air quotes properly.” From our conversation, here are three lessons for leaders.


[Use Cases] 5 industries for Conversational AI

Conversational AI use cases

According to Markets and Markets, the expected global Conversational AI market size is set to grow from USD 4.8 billion in 2020 to USD 13.9 billion by 2025, at a Compound Annual Growth Rate (CAGR) of 21.9%. Therefore, companies, industry leaders, and employees need to understand precisely what Conversational AI is, why it’s essential, and the many use cases of this AI application disrupting Healthcare, IoT Devices, Retail, HR, and Finance and Banking Industry.


Merger between Nvidia and Arm delayed by Brussels

Nvidia - Arm merger

The merger between Nvidia and Arm Holdings has been delayed by Brussels on behalf of the EU, claims a report by The Guardian. The deal in question would allow the computer systems designer to take over Arm from SoftBank, while giving out a majority of its shares to the Japanese investment giant. This new hurdle has apparently forced the British semiconductor chip-maker to consider going for a stock market float, most likely on the New York Stock Exchange, people close to the matter say.


No More “What” Without the “Why”

The book of why

It’s Time that Leaders Unite Machine Learning and Causal Inference

Over the last few months, I had the chance to help various organizations and leaders leverage their large databases with machine learning. I was working in particular with member organisations that struggle with rising dropout rates (churn), an issue that became even more serious during the pandemic, when individual incomes have been declining and the fear of job loss rising.

With machine learning, we used very large membership databases with individual-level information (e.g. age, gender, occupation, marital status, postal code, etc.) to identify the members with high dropout risk in order to target them ex ante. A classification problem par excellence.

Machine Learning tells us the “What”, Causal Inference the “Why”

Despite the overall good performance of the machine learning models, our clients were always interested in one obvious question: Why does an individual member leave? Unfortunately, machine learning models are not suited to identifying the causes of things; they are built to predict things.

However, knowing the reason for leaving is of immense business value, as it determines the strategic decisions that leaders have to take. For instance, if someone ends her membership because she is moving abroad, offering lower membership fees will do little to nothing to keep her as a member.

These experiences made me realize that combining the predictive power of machine learning to know the “What” (e.g. individuals with a high probability of leaving) with the methods of causal inference to understand the “Why” (e.g. reasons for leaving) is essential to use the massive datasets within organisations to their fullest potential.

Measuring correlation is easy, measuring causality is not

To stick to our churn example, an organisation might be interested in whether an increase in its membership fees (one potential explanatory variable) leads members to leave (the outcome variable). Estimating this causal relationship is anything but trivial. As we all know from Statistics 101, correlation is not causation, and, in fact, the absence of correlation is also not the absence of causation. Thus, the underlying issue of measuring causality cannot be solved with more or even better data; nor is it a (predictive) modelling problem per se, the kind at which machine learning has turned out to be so successful in recent years.

Instead, the reason why causal effects are hard to measure is endogeneity. One of its most substantive sources is omitted variable bias, which arises when an unknown omitted factor influences the independent variable of interest and the outcome variable at the same time.

For instance, let’s imagine an organisation comes up with a new explanatory factor, apart from membership fees, and would like to investigate whether the age of an individual increases the likelihood of leaving the organisation. A commonly used method such as ordinary least squares (OLS) regression will not deliver meaningful results, because both age and the probability of leaving may be influenced by an unknown confounding factor such as individual income. This is because younger people usually earn lower salaries, about which we often have no information.

Unfortunately, omitted variable bias is present in almost all empirical analyses dealing with observational data. For machine learning applications, this is not much of a problem per se, because all we want is a good prediction of the outcome variable, the “What”.
However, since we often need to know the causes of the “What”, understanding the “Why” is essential to enable organisations to take strategic actions in their favour. Therefore, beyond conventional ways of measuring associations, alternative approaches are needed to quantify causality reliably.

Many organisations run the best experiments for causal inference without even knowing

To see how, let’s dive a bit into the broad spectrum of approaches to credibly measure causal relationships. Nowadays, most empirical scientists would argue that randomized controlled trials (RCTs) are the gold standard of causal inference. The logic behind RCTs is rather simple: the researcher randomly splits a sample of individuals into two groups and gives the treatment of interest to one group (the “treatment group”) but not to the other (the “control group”). The difference in the outcome between the treatment and the control group is then considered the causal effect of the treatment. If you closely followed the news about the efficacy of the COVID-19 vaccines, you will have noticed that these studies usually use the same research design: clinical RCTs.

Despite their great strength and credibility in measuring causality, such controlled experiments are often very expensive, sometimes ethically questionable, and most of the time simply impossible. For instance, if we wanted to use an RCT to understand whether age has a causal effect on the likelihood of ending a membership, we would need to change the age of randomly selected individuals in the treatment group. Obviously, we cannot change the age of a person. Therefore, we need different approaches.

Such approaches are broadly called quasi-experimental research designs, and they are often the only way to measure causality with observational data.
One of the earliest and most astonishing applications of these methods dates back to the ingenious London doctor John Snow, who used water-supply data to find out that cholera was transmitted via water rather than air (the dominant view in 1854), without a single look at the bacterium through a microscope. His discovery has most probably saved millions of lives.

While researchers nowadays are trying hard to live up to the ingenuity of Dr. Snow, applying such quasi-experimental methods is anything but trivial. However, the good news is that many organisations are in fact conducting very large-scale quasi-experiments, often without even knowing it. The analysis of many such datasets has enabled astonishing discoveries in business, economics, and political science over the last quarter of a century.

For instance, Hartmann and Klapper (2018) and Stephens-Davidowitz et al. (2017) quantified the returns to television advertising using the Super Bowl as a natural experiment. Anderson and Magruder (2012) found that an “extra half-star rating [on Yelp] causes restaurants to sell out 19 percentage points (49%) more frequently” by adopting a clever regression-discontinuity design. And in a very recent study, Garz and Martin (2020) measured the causal impact of media reporting on vote choice for the current government, which is highly relevant for political campaigning in democracies around the world.

Leaders need to ensure we use the best of both worlds

All these examples highlight that understanding causal relationships can have large implications for the strategy an organisation adopts to compete successfully in the future.
While AI-based solutions have already received a lot of interest beyond academia, causal inference has not yet been extensively leveraged for data-driven decision making.

However, predicting the “What” and understanding the “Why” is, in my opinion, the most comprehensive and valuable way of exploiting the wealth of information our organisations are sitting on nowadays. The challenges ahead are huge. But so are the datasets. We have the computational resources. And the (quasi-)experimental setups.

Leaders need to ask themselves: How can we combine the predictive power of machine learning with the strengths of causal inference to leverage our datasets in a post-pandemic world? To stay ahead, we should no longer ask for the “What” without the “Why”.

Reading suggestions

- The Book of Why, by Judea Pearl
- Causal Inference: The Mixtape, by Scott Cunningham
- Mostly Harmless Econometrics: An Empiricist’s Companion, by Joshua D. Angrist and Jörn-Steffen Pischke

Want to try out how to find the “What” and the “Why” in your data? We at LEAD Machine Learning are experts in the field of Data Science, Machine Learning and Causal Inference and are happy to connect with you on how you can exploit your large data resources and create real value for your missions ahead.
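
To make the omitted variable bias and the RCT logic described in this article concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not taken from any real membership data: the variable names, the effect sizes, the continuous "churn propensity" outcome, and the idea that unobserved income drives both exposure to a fee increase and churn. It uses the fee increase rather than age as the treatment, because a fee can be randomized while age cannot.

```python
import numpy as np


def simulate(n=50_000, true_effect=0.5, seed=0):
    """Compare a naive OLS estimate with an adjusted OLS and a simulated RCT.

    All numbers are hypothetical. `income` plays the role of the omitted
    confounder: it affects who faces a fee increase and how likely they
    are to churn.
    """
    rng = np.random.default_rng(seed)
    income = rng.normal(size=n)  # standardized, unobserved in practice

    # Observational data: higher-income members are more likely to be
    # exposed to the fee increase (assumed for illustration).
    p_treat = 1.0 / (1.0 + np.exp(-1.5 * income))
    fee_obs = (rng.random(n) < p_treat).astype(float)

    # Randomized data: the fee increase is assigned by a coin flip.
    fee_rct = (rng.random(n) < 0.5).astype(float)

    def outcome(fee):
        # Churn propensity rises with the fee and falls with income.
        return true_effect * fee - 1.0 * income + rng.normal(size=n)

    churn_obs = outcome(fee_obs)
    churn_rct = outcome(fee_rct)

    def ols_slope(y, *regressors):
        # Least-squares fit with an intercept; returns the coefficient
        # on the first regressor.
        X = np.column_stack([np.ones(n), *regressors])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(beta[1])

    treated = fee_rct == 1
    return {
        "naive_ols": ols_slope(churn_obs, fee_obs),
        "ols_with_income": ols_slope(churn_obs, fee_obs, income),
        "rct_diff_in_means": float(
            churn_rct[treated].mean() - churn_rct[~treated].mean()
        ),
    }


if __name__ == "__main__":
    for name, estimate in simulate().items():
        print(f"{name}: {estimate:.3f}")
```

In this setup the naive regression of churn on the fee increase lands far from the assumed true effect of 0.5, because income is left out. Adding income as a regressor, or randomizing the fee increase, brings the estimate back close to 0.5. In real data the confounder is usually not observed, which is why randomized or quasi-experimental designs are needed.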


My two EdTech adventures


I have been thinking a little about the impact of digital technologies on education. It has been significant and, with the advent of the pandemic, ubiquitous. I am interested in NLProc (Natural Language Processing) and have been pondering its applications in pedagogy and education. This brought back some memories of what can loosely be considered my EdTech adventures.


Deadline 2024: Why you only have 3 years left to adopt AI


If your company has yet to embrace AI, you’re in a race against the clock. And by my calculations, you have just three years left. How did I arrive at 2024 as the deadline for AI adoption? My prediction — formulated with KUNGFU.AI advisor Paco Nathan — is rooted in us noticing that many futurists’ J curves show innovations typically have a 12-to-15-year window of opportunity, a period between when a technology emerges and when it reaches the point of widespread adoption.


Democratization of Artificial Intelligence: Is it a boon or bane?


Anyone with a spark to learn AI inside and out can do so with readily available resources. This is the core idea behind the democratization of artificial intelligence: AI education that is accessible to anyone and everyone. But this is also where the real question arises: is the democratization of artificial intelligence a boon or a bane?


The gap between Data science and the organization


The term ‘data scientist’ was nonexistent when I started my journey in the data analytics space, but it is now the so-called ‘sexiest job of the decade’, probably second only to the space crews at SpaceX and Virgin Galactic! Data has always fascinated me, and I am sure it will continue to do so for many years to come. Throughout this journey I have seen many projects take off and many fall apart at various stages. VentureBeat’s 2019 quote still holds true: ‘87% of data science projects never make it to production’, and there are several reasons behind this number that need serious intervention and fixes to improve it.


Small businesses use AI Tools to increase their leads by 50%


Artificial Intelligence (AI) tools and resources have become indispensable to today’s industry. A 2019 study by Gartner shows that in the last four years, the use of AI has increased by 270%. In the last year alone, the share of organizations that have deployed AI in some way has more than tripled, from 4% to 14%.
