Succeeding in a Data-Driven World: Lead Data Architect Rik talks about the Challenges and Opportunities for Businesses in 2023

Are you curious about the benefits of becoming a data-driven company in 2023? Read this article featuring Lead Data Architect Rik, where he talks about the challenges and opportunities of data-driven decision-making. Discover how to ensure the accuracy, completeness, and relevance of your data, and learn how emerging technologies such as artificial intelligence, machine learning, and the Internet of Things can help you get more value from your data.

What does it mean to become a data-driven company, and why is it important in 2023?

In today’s fast-paced business world, becoming data-driven is crucial for success! By harnessing the power of data, you can make smarter decisions and gain a comprehensive understanding of your customer base. Don’t rely solely on intuition – it’s too risky! Instead, access all the data available to gain valuable insights into the context of every client. By doing so, you can move beyond a limited perspective and make informed decisions that positively impact your business. In 2023, it’s more important than ever to embrace data-driven decision-making. By failing to do so, you risk falling behind the competition who are already utilising data to their advantage. Don’t miss out on the competitive edge that can be gained from analysing all the available data. 

What are some of the biggest challenges companies face when trying to become data-driven, and how can they overcome these challenges?

Let’s start with the easiest step in the process of becoming data-driven: acquiring data. Many clients have collected terabytes of data but aren’t utilising it effectively. The biggest challenge is identifying the data that’s most relevant to solving a specific problem and focusing on that data. Rather than taking the easy route of importing all available data, it’s important to guard the scope and limit the data being used to what delivers actual results. For that, it’s crucial to start with business concepts and goals. For example, if a customer wants to become more data-driven, it’s important to understand what that means to them. By limiting the scope to a specific area, such as sales or cross-selling, businesses can focus on the relevant data points needed to achieve their goals. This approach allows for a more targeted and effective use of data, rather than drowning in a sea of unnecessary information.

At Devoteam G Cloud we first meet with the customer to understand their goal of becoming more data-driven. We try to clarify what they mean by this, whether they want insights on all areas of their business or just a specific area such as sales. For example, we might focus on understanding why certain sales are successful, which customers are buying the most, and what cross-selling opportunities exist. We then narrow the scope to focus solely on cross-selling and determine which specific data points we need to achieve our objectives. We assess whether we need to capture new data or if the required data is already available. By starting with the business goals and desired actions, we can determine the necessary data points and avoid getting lost in a sea of irrelevant information.

How can a company ensure that its data is accurate, complete, and relevant?

The importance of data relevancy cannot be overstated in business. The first step is to determine whether the data is relevant and useful to the business. If it is, it needs testing to ensure its validity and accuracy. Accuracy can be lost in two ways: through incorrect transformations and through incorrect data input. Providing people with access to data helps them understand the value of clean data, leading to better data quality. It is important to make someone accountable for data quality and to monitor it over time, so that the necessary actions can be taken to improve it.
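The two kinds of accuracy loss mentioned above can be tested separately. The following is a minimal sketch in plain Python, assuming hypothetical order records; the field names, the email pattern, and the cents-to-euros transformation are illustrative assumptions, not something described in the interview.

```python
import re

# Hypothetical sample rows; field names and values are illustrative only.
source_rows = [
    {"order_id": 1, "email": "ann@example.com", "amount_cents": 1250},
    {"order_id": 2, "email": "not-an-email", "amount_cents": 800},
    {"order_id": 3, "email": "bob@example.com", "amount_cents": 4300},
]

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_input_errors(rows):
    """Input check: return the order_ids whose email is malformed."""
    return [r["order_id"] for r in rows if not EMAIL_PATTERN.match(r["email"])]


def to_euros(rows):
    """Transformation under test: convert amounts from cents to euros."""
    return [
        {"order_id": r["order_id"], "amount_eur": r["amount_cents"] / 100}
        for r in rows
    ]


def transformation_preserves_total(rows, transformed):
    """Transformation check: row count and total amount must reconcile."""
    total_before = sum(r["amount_cents"] for r in rows)
    total_after = round(sum(t["amount_eur"] for t in transformed) * 100)
    return len(rows) == len(transformed) and total_before == total_after


input_errors = find_input_errors(source_rows)
transformed_rows = to_euros(source_rows)
totals_match = transformation_preserves_total(source_rows, transformed_rows)
```

Running checks like these on a schedule and reporting the results to whoever is accountable for data quality gives that person something concrete to monitor.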

Managers are often worried about the security implications of data and, because data has been hoarded over time, have difficulty knowing what data they actually have. To solve this, it is essential to translate technical terms into business terms, so that in a next phase you can easily define the sensitivity of each attribute and determine who can access it. Defining business terms helps the security team understand the sensitivity of data attributes, which leads to better control over data access. When presenting a table to the security team, it is crucial that the security team, and not just the database administrator, understands what it means.

What role do emerging technologies such as artificial intelligence, machine learning, and the Internet of Things play in the move towards data-driven decision-making?

IoT, AI, and ML are different yet related concepts that contribute to getting more value from data. IoT makes it easier to collect data since sensors are cheaper, and it shows that it’s easier to get more data into a platform. On the other hand, AI and ML use the collected data to gain more insights easily.

AI and ML are an extension of data engineering, and they help to get a lot of the same results within data engineering, but faster. With AI and ML, you don’t have to understand your data to the maximum; instead, you identify what is valuable, and the models will learn how the pieces are related to each other to make more value from your data.

Ultimately, it is all about getting more return on investment. By leveraging IoT, AI, and ML, you can get more value out of your data. It is no longer enough to just collect data; the real value lies in the insights that can be drawn from the data. Therefore, investing in these technologies is necessary to ensure that you get more value from your data, making your business more successful.

How can companies ensure that their data is secure and compliant with regulatory requirements, and what technologies can be used to achieve this?

When it comes to managing data, there are a few key considerations to keep in mind.

  • Firstly, lineage is crucial. You need to know where your data is coming from, where it’s going, and where it ends up in case something needs to be revoked. For instance, if a customer revokes access to one of your platforms, you must understand what data is affected. Luckily, BigQuery lineage is now available in preview, which means you can see how data moves through different tables and platforms.
  • In addition to lineage, column masking and overall security are essential. With BigQuery, you can define specific security groups based on Google Groups, which can be assigned access to different sensitivity levels. For instance, Level Three clearance might allow access to primary email addresses, but only in a hashed form. Level Four clearance might enable viewing in plain text, while Level Five could allow access to revenue data. By creating and documenting these rules, you can ensure that your security teams understand them and that you remain compliant with regulations.
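The clearance-level example can be written down as an explicit rule set, which also makes it easier to document. The sketch below is plain Python that only illustrates the logic; it is not the BigQuery masking API. The function name, the field names, the choice of SHA-256, and the assumption that higher levels include everything lower levels can see are all illustrative.

```python
import hashlib


def mask_row(row, clearance_level):
    """Return only the fields, masked where needed, that a clearance level may see.

    Assumed rules (taken from the example in the text, cumulative by assumption):
      level 3 -> email visible only as a hash
      level 4 -> email visible in plain text
      level 5 -> revenue visible as well
    """
    visible = {"customer_id": row["customer_id"]}
    if clearance_level >= 4:
        visible["email"] = row["email"]
    elif clearance_level >= 3:
        # Hashing keeps the value usable for joins without exposing the address.
        visible["email"] = hashlib.sha256(row["email"].encode("utf-8")).hexdigest()
    if clearance_level >= 5:
        visible["revenue"] = row["revenue"]
    return visible


# Hypothetical record used only for illustration.
customer = {"customer_id": 42, "email": "ann@example.com", "revenue": 1800}

level_two_view = mask_row(customer, 2)
level_three_view = mask_row(customer, 3)
level_four_view = mask_row(customer, 4)
level_five_view = mask_row(customer, 5)
```

Keeping the rules in one small, readable place like this is one way to make sure the security team and the data team are looking at the same definition.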

Overall, managing data can be complex, but with tools like BigQuery and a clear understanding of lineage and security, you can keep your data safe, secure, and compliant.
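To make the lineage point concrete: once lineage is recorded, finding what is affected by a revocation is a graph traversal. The following is a minimal sketch in plain Python, assuming a hypothetical lineage map with made-up table names; a managed lineage feature would supply this graph for you, and nothing here reflects a specific product API.

```python
from collections import deque

# Hypothetical lineage: each table maps to the tables built directly from it.
LINEAGE = {
    "raw.customers": ["staging.customers"],
    "raw.products": ["mart.sales"],
    "staging.customers": ["mart.sales", "mart.marketing_audience"],
    "mart.sales": ["dashboard.revenue"],
    "mart.marketing_audience": [],
    "dashboard.revenue": [],
}


def downstream_of(table, lineage):
    """Return every table that directly or indirectly depends on `table`."""
    seen = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# If customer data must be revoked, these are the places it may have ended up.
affected = downstream_of("raw.customers", LINEAGE)
```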

How can a company encourage a culture of data-driven decision-making throughout the organisation, and what role do executives and managers play in this process?

One crucial aspect of data management that often gets overlooked is the need for clear communication around what data is available and what is not. To facilitate this, a data dictionary or glossary can be incredibly helpful. These resources allow users to search for specific data within the platform and access it without needing to go through IT or security systems for clearance.

When it’s easy for users to access data, they’re more likely to use it. If getting access to data requires sending tickets or going through a lengthy process, people may turn to others who already have access instead of starting the process themselves. This can create inefficiencies and slow down decision-making.

To address this, organisations should prioritise creating user-friendly data dictionaries or glossaries that are easily accessible to all users. By doing so, they can empower their teams to make data-driven decisions and achieve their objectives more efficiently. Don’t overlook the importance of clear communication when it comes to data management.

What are some of the key metrics that a CTO should track in order to measure the success of a company’s data-driven initiatives?

When it comes to data management, there are several key metrics that organisations should pay attention to. First and foremost, the number of distinct people who are accessing the data platform should be high. This indicates that the platform is being utilised and that people are finding value in the data it provides.
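As an illustration of the first metric, distinct users can be counted per month from an access log. This is a minimal sketch assuming a hypothetical log of (date, user) pairs; the log format and the names are invented for the example, and a real platform would expose its own audit log.

```python
from collections import defaultdict

# Hypothetical access log: (ISO date, user) pairs, possibly with repeats.
access_log = [
    ("2023-01-03", "ann"),
    ("2023-01-03", "ann"),
    ("2023-01-15", "bob"),
    ("2023-02-01", "ann"),
    ("2023-02-09", "cara"),
    ("2023-02-20", "bob"),
    ("2023-02-21", "dev"),
]


def distinct_users_per_month(log):
    """Count how many different people accessed the platform in each month."""
    users_by_month = defaultdict(set)
    for date, user in log:
        month = date[:7]  # "YYYY-MM"
        users_by_month[month].add(user)
    return {month: len(users) for month, users in sorted(users_by_month.items())}


monthly_users = distinct_users_per_month(access_log)
```

Counting distinct people rather than raw queries avoids one heavy user making the platform look widely adopted.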

On the other hand, the number of dashboards provided by the company should be kept low. While it may be tempting to create a dashboard for every possible use case, maintaining them can be difficult and time-consuming. With more dashboards you also increase the chance of having multiple interpretations of the same definition which results in contradicting dashboards. Therefore, it’s essential to strike a balance between the number of dashboards available and their relevance to the organisation’s needs.

Another critical metric is the number of feature requests that the organisation receives. This indicates that people are actively engaged with the data platform and that there is a continuous effort to improve its capabilities. However, it’s important to manage feature requests effectively, prioritising those that will provide the most significant value and avoiding overloading the development team with too many requests.

Data quality is also a key metric, although it can be challenging to define and measure. While every company may have a different approach to calculating data quality, it’s crucial to ensure that everyone is using the same definition and that the quality is improving over time.

Lastly, data availability and trustworthiness are crucial metrics. People should be able to access the data platform when they need it, and they should trust the data they find there. If the data platform is frequently down or if the data is incorrect, people may lose faith in it and stop using it altogether.

In summary, organisations should focus on metrics that indicate high utilisation of the data platform, low maintenance requirements, ongoing engagement with the platform, improving data quality, and ensuring data availability and trustworthiness. By doing so, they can build a data-driven culture that supports their objectives and helps them achieve their goals.

What advice do you have for companies that are just beginning to explore data-driven decision-making, and what steps should they take to get started?

To build a successful data platform, there are a few key metrics to keep in mind.

Firstly, the number of distinct people accessing the platform should be high, while the number of dashboards provided by the company should be low. It’s important to resist the urge to create too many dashboards, as they can become difficult to maintain and lead to conflicting information.

Another important metric is the number of feature requests or tickets received, which should be high enough to show that the platform is constantly improving and evolving. Data quality is also important, although it can be difficult to measure. It’s essential to use a consistent definition and strive to improve over time.

When starting out, it’s important to begin with one dashboard and focus on getting that pipeline working for a specific use case. Don’t be afraid of making mistakes early on, as it’s better to catch them when working on a single dashboard rather than after the platform has been widely adopted.

Security should also be a priority from the beginning. Start with basic principles, such as defining who has access to which data, and iterate on these principles as the platform grows.

Finally, it’s important to stay focused on business goals and not get caught up in the hype of being data-driven. The ultimate goal should be to use data to make informed business decisions and generate revenue. With these metrics in mind, it’s possible to build a successful data platform that is trusted by users and delivers real value to the organisation.

What are the most important technical considerations when implementing a data-driven strategy, and how can companies ensure that their technology infrastructure is capable of handling large amounts of data?

When it comes to managing data, there are a lot of factors to consider. The first thing to determine is how much knowledge you have in-house. If you don’t have the expertise to maintain a custom solution on Kubernetes, then it’s better to opt for a managed solution like BigQuery. However, if you need a solution that’s available on multiple clouds or on-premises due to regulatory considerations, you may need to go for custom development on Kubernetes.

Unfortunately, some companies are hesitant to use BigQuery because of vendor lock-in, and instead opt for solutions that are difficult to maintain and to keep compliant with regulatory requirements. I suggest going for a managed solution until you have a reason not to. Managed solutions help developers and security teams deliver features faster and lower the burden on administrators. Which solution to recommend depends on what you want to do with the data.

For example, if you want to provide historical insights and trends, a data warehouse can do that. If you’re working with BI reports, then it’s best to use a column-oriented database like BigQuery. However, if you’re dealing with inserts, updates, and deletes, then a row-oriented database like Postgres is more suitable.
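The difference between the two layouts can be shown with ordinary Python data structures. This is a toy sketch with hypothetical sales records, not a model of how any particular database engine stores data: it only shows why an aggregate favours a column layout and a single-record change favours a row layout.

```python
# Hypothetical sales records; all names and values are illustrative.
rows = [
    {"id": 1, "region": "EU", "revenue": 120},
    {"id": 2, "region": "US", "revenue": 200},
    {"id": 3, "region": "EU", "revenue": 80},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    return {key: [record[key] for record in records] for key in records[0]}


columns = to_columns(rows)

# Analytical query (BI style): the total only needs the "revenue" column,
# so a column layout can skip "id" and "region" entirely.
total_revenue = sum(columns["revenue"])

# Transactional change: in the row layout one record is one object to edit.
rows[1]["region"] = "UK"

# In the column layout the same change means locating the position first
# and then editing every affected column list at that position.
position = columns["id"].index(2)
columns["region"][position] = "UK"
```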

When it comes to choosing a database engine, it’s important to consider how you want to access the data. Additionally, you’ll need to determine how up-to-date the data needs to be, which depends on your data pipelines.

It’s also important to consider what kind of knowledge you already have in-house. If you have experience with a certain framework, there’s no need to retrain everyone on a new framework that’s more or less the same.

Finally, there is a new database called AlloyDB, promoted by Google, which promises a combination of row-based and column-based storage. In the end, it’s all about weighing your options and finding the best solution for your specific needs.

Can you provide any examples of companies that have successfully become data-driven, and what can we learn from their experiences?

What kind of ROI can businesses expect to see from investing in becoming data-driven, and how can they measure and track this ROI?

To succeed in the data-driven world, it is important to focus on the areas where data generates revenue. While it would be nice to have all data available for everything, that is not always possible. Therefore, it is crucial to determine what areas to concentrate on, such as serving customers better, upselling, cross-selling, and attribution. Once the focus area is chosen, it becomes easier to calculate the return on investment. For example, if the goal is to increase leads, tracking how many leads were generated by increasing the marketing budget is an excellent way to measure ROI.
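The lead example comes down to simple arithmetic. The sketch below uses the common definition ROI = (gain - cost) / cost; the budget, the lead count, and the value per lead are hypothetical numbers chosen only to show the calculation.

```python
def return_on_investment(gain, cost):
    """ROI as a fraction of the cost: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical figures for one campaign period.
extra_marketing_budget = 10_000  # additional spend
extra_leads = 250                # leads attributed to that spend
value_per_lead = 60              # assumed average value of one lead

gain = extra_leads * value_per_lead
roi = return_on_investment(gain, extra_marketing_budget)
cost_per_lead = extra_marketing_budget / extra_leads
```

With these assumed figures the gain is 15,000, so the ROI is 0.5 (50%) at a cost of 40 per lead. The hard part in practice is the attribution, meaning how many leads really came from the extra budget, which is why the focus area has to be chosen first.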

On the technical side, investing in a data dictionary may not always seem like a priority from a business standpoint, but it can result in a decrease in support tickets. This can be a significant ROI as the same knowledge can be applied to the data, and fewer support engineers are needed. Additionally, tracking accuracy with AI models to predict stock prices, for instance, can result in significant returns if predictions improve.

It is crucial to track ROI based on the focus area of the business. By doing so, it becomes easier to determine the impact of investments made in different areas, such as marketing, customer service, and technical support. Ultimately, data-driven decision-making is crucial in today’s world, and tracking ROI helps businesses stay on top of their game.

How do you see the evolution of data-driven technology and tools impacting businesses over the next few years, and how can companies stay ahead of this evolution? / What do you see as the future of data-driven decision-making, and how should companies be preparing for this future? What will it look like 10 years from now? 

In the world of data engineering, there is a common struggle with dashboards. While they can provide useful insights, they are difficult to maintain and often require someone to build and update them. What if there was a solution that could bypass dashboards altogether? Imagine if there was a bot or AI tool, like Bard or ChatGPT, that could answer questions on the spot, without the need for a dashboard. It would be a game-changer for data engineers, who could spend less time building and maintaining dashboards and more time understanding what people actually want from their data.

Looking to the future, this type of AI tool could become a reality within the next 10 years. It would be incredible to have a chatbot that could help people easily access the data they need, without requiring a technical background in SQL or table joins. By providing the data in a single structure, people could combine it themselves and gain insights without struggling with complex dashboard interfaces.

In addition to the potential for chatbots, streaming data will continue to become more important as compute costs decrease. However, despite advances in technology, some challenges will remain. Ensuring proper governance and security protocols for massive amounts of data will continue to be a concern, especially with the introduction of new data privacy laws, such as the right to be forgotten.

The future of data engineering looks bright, with the potential for innovative solutions that will make accessing and utilising data easier and more efficient than ever before.

Are you struggling to make sense of your data? Do you feel like you’re missing out on generating business value?

Let us help you build a winning Data & Analytics Strategy that generates revenue, operational excellence or optimises financial performance. With years of experience in data and analytics, we know how to turn raw data into actionable insights that drive growth and success.