Agile Data Science: What does it mean? How to manage Data Science Projects


This post was originally published by Ilro Lee at Towards Data Science - Medium

Photo by İrfan Simsar on Unsplash

Does it even work?

“What does agile data science mean?” you might be asking. In short, it means applying agile, a methodology that many industries have embraced, to data science work. Done well, it can make your data science projects more efficient and less costly. This blog post talks about what agile data science is, how it can help you manage your projects better, and how it can be used in the context of your company’s culture.

What I mean by agile data science is that the agile methodology can be applied to data science projects. For some people, this might not sound like anything exciting, but for others, it could be a game-changer.

The agile manifesto and its accompanying principles state:

  • “deliver working software frequently”
  • “customer collaboration over contract negotiation”
  • “responding to change over following a plan”.

This is essentially agile data science in practice: it is what agile data scientists do on a daily basis. They collaborate with their clients and deliver working software frequently. That said, I acknowledge that there is a wide range of project types, expectations, and skill levels of those involved.

This means agile data science focuses on iterative development and delivering working software or solutions frequently. This makes agile projects more like small startups than traditional waterfall projects where the client only sees the end result at the very end of a project, which can take years.

Agile data scientists are also less concerned about upfront requirements gathering because they know that requirements are likely to change. Instead, they focus on minimum viable products (MVPs): the smallest solution possible for their clients’ needs, which they then iterate on based on client feedback. This makes agile data scientists more like product managers than traditional software developers or engineers, who focus much more heavily on planning work upfront.

Agile data science project management can be described as a flexible and efficient method for managing data science projects. Agile has become popular in the last decade as many programmers have realized that just because they can do something doesn’t mean it should be done. They have learned that an agile methodology is a great tool for managing projects and coming up with high-quality products in short time frames, despite limited resources or strict deadlines.

The agile method also works well for data science projects because it allows a lot of flexibility and encourages creativity while maintaining the structure needed to ensure timely completion. This type of project management is especially useful in an environment where new technologies are emerging constantly, which can make long-term planning difficult. In addition, data science projects are usually quite complex and require a lot of creativity, so they need a project management style that can accommodate both.

The agile method is flexible enough that it can be adapted for data science without too much difficulty, but there are some hurdles related to the specific nature of data science projects. For example, data scientists often have different goals than software developers, which can make it difficult for them to learn how best to work within an agile environment. There also needs to be a specific plan in place that leads from one sprint (or iteration) into the next.

Additionally, when the agile methodology is not well applied, data scientists can feel frustrated with the process. As much as we want to have well-defined goals, data science projects often end up in an agile environment precisely because the goal and the roadmap are not well defined.

This doesn’t mean that agile methodology is not valuable for data scientists, but rather that agile project management needs to be adapted specifically for this type of work by making sure there’s a plan and structure in place without too much process overhead. It also means that data scientists need to learn agile project management practices in order for their projects to run smoothly.

In the end, agile methodology and data science share a lot of common qualities that make them well suited for each other: flexibility, creativity, and speed while maintaining structure and planning.

When it comes to planning an agile data science project, the first step is the most important one, and that is true irrespective of which methodology you use. You have to clearly define the business problem and the project objective. Define what problems your client or product owner needs to solve and how you can help solve them in an agile way using data science techniques. Once that’s done, you have a clear vision of where you’re going, which should make it easy for everyone involved to get on board.

Once the business problem and objectives are clearly defined and signed off by your product owner or client, you gather requirements and prioritize them. For those who are not familiar with the terms, the prioritized list of requirements is called a backlog, and you create user stories to capture each requirement. I am assuming that your workplace has a software solution to manage this, but if it doesn’t, I have found Jira to be an excellent tool for managing agile projects.
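To make the idea of a prioritized backlog concrete, here is a minimal Python sketch. The `UserStory` fields, the 1 to 10 value score, the story-point estimates, the value-per-effort ranking, and the example stories are all assumptions made for illustration; they are not a prescribed agile rule, and your team may well rank its backlog differently.

```python
from dataclasses import dataclass


@dataclass
class UserStory:
    """One requirement, phrased from the stakeholder's point of view."""
    title: str
    business_value: int  # hypothetical 1-10 score agreed with the product owner
    effort_points: int   # hypothetical story-point estimate from the team


def prioritize(backlog):
    """Order stories by business value per point of effort, highest first."""
    return sorted(
        backlog,
        key=lambda story: story.business_value / story.effort_points,
        reverse=True,
    )


# Illustrative backlog for an imaginary churn-prediction project.
backlog = [
    UserStory("Baseline churn model on last year's data", 8, 5),
    UserStory("Dashboard of churn risk by customer segment", 6, 3),
    UserStory("Automated weekly retraining pipeline", 4, 8),
]

for story in prioritize(backlog):
    print(f"{story.title} (value={story.business_value}, effort={story.effort_points})")
```

Whatever scoring you choose, the point is that the ordering is explicit and can be revisited with the product owner as requirements change.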

You then go through iterations — sprints (2–3 weeks per sprint), daily standups, and a review after each sprint. You repeat until the project is done.
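As a rough illustration of how work gets split across sprints, the sketch below packs stories into sprints in priority order. The `plan_sprints` function, the fixed per-sprint capacity, and the story-point numbers are hypothetical; real sprint planning is a team conversation and gets revised after every review, so treat this only as a way to picture the loop.

```python
def plan_sprints(stories, capacity):
    """Greedily pack (name, points) stories, in priority order, into sprints.

    `capacity` is an assumed number of story points the team can finish
    in one sprint. A new sprint starts whenever the next story would not fit.
    """
    sprints, current, used = [], [], 0
    for name, points in stories:
        if points > capacity:
            raise ValueError(f"{name!r} exceeds sprint capacity; split it first")
        if used + points > capacity:
            sprints.append(current)
            current, used = [], 0
        current.append(name)
        used += points
    if current:
        sprints.append(current)
    return sprints


# Illustrative stories, already in priority order, with made-up estimates.
stories = [
    ("dashboard", 3),
    ("baseline model", 5),
    ("feature review", 2),
    ("retraining pipeline", 8),
]

plan = plan_sprints(stories, capacity=8)
for number, sprint in enumerate(plan, start=1):
    print(f"Sprint {number}: {', '.join(sprint)}")
```

In practice you would re-run this kind of planning after each sprint review, because the feedback you receive usually changes both the priorities and the estimates.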

Applying agile methodologies to data science projects can be extremely frustrating, both for the data science team that is delivering and for clients with high expectations. You might be in a situation where you don’t have a choice of which methodology to apply to your project. If so, consider the following.

  • Be transparent; share information as soon as possible with all stakeholders, even if it is not perfect yet. This helps everyone keep track of what has been done so far and how decisions are made by providing insight into the thought process behind them. Transparency also implies that everyone is aware of the risks and how these will be mitigated.
  • Keep agile sessions short: around one hour once a week for sprint planning, plus daily 15-minute standups (for teams that work in shifts). People need time to digest information, ask questions, and give feedback, so don’t expect them to do all of this during a long session, because they will stop paying attention.
  • Have agile sessions face to face where you can; remote agile is possible, but it’s hard and tends not to work well in the long run. That said, it might be the only option for data science teams that are spread across different locations, or for teams working remotely because of the pandemic.
  • Agile projects require high levels of commitment from all team members, especially product owners. Seek out a top-notch product owner who can speak both your language and the language of the business. By nature, the agile process requires effective expectation management through strong and continuous stakeholder management. Data scientists should make a point of meeting with the product owner at least once a week.
  • The team matters more than the individual; agile projects are about team dynamics, and data science teams that employ agile methodologies need good communication skills as well as technical abilities. Know your group’s limitations and be aware of how much agile process your data science team can handle.

We hope this blog post has helped you learn more about what agile data science is and how it can be beneficial to your team. Agile data science can help with project management, increase efficiency, decrease costs, and keep up in the competitive field of analytics. If you want to know more or have any questions about anything we covered here, don’t hesitate to leave a comment below!
