Build Consulting was pleased to present virtually at Good Tech Fest, May 18-19th, 2021

Last year more than 1,500 people from 42 countries participated in Good Tech Fest and this year was the most global, representative, and accessible event yet!

If you’re new to the Good Tech Fest community, they work to utilize data and technology for social impact. Unlike other nonprofit or social sector technology conferences, they are very much focused on program and field technologies. Basically, how can we use technology to further our missions?

Good Tech Fest is an impact conference first and foremost. So if you’re a program manager, developer, product guru, data scientist, visualization expert, grantmaker, founder, or just plain impact nerd, join Jo Butler and Peter Mirus at Good Tech Fest for these presentation videos, coming soon!

Build Consulting Presentations:

Jo Butler, Build Consulting’s Senior Nonprofit CRM Database Manager, shares strategic CRM adoption tips to get your CRM used by everyone at your nonprofit organization.

Strategic Tips for CRM Adoption at Your Nonprofit

View this video directly on YouTube (and subscribe to our channel!)

Listen to the Podcast

Like podcasts? Find our full archive here or anywhere you listen to podcasts. Or ask your smart speaker.

Peter Mirus, Partner at Build Consulting, presents an exploration of the dimensions of data quality (completeness, validity and accuracy, consistency, timeliness, and integrity), interwoven with impact examples from the nonprofit and public sectors. Attendees will learn a structure for how to qualify “what makes data good,” and how to talk compellingly about data quality in their organizations.

How Data Quality Defines Your Organization

View this video directly on YouTube (and subscribe to our channel!)

Listen to the Podcast

Like podcasts? Find our full archive here or anywhere you listen to podcasts. Or ask your smart speaker.


  • Jo Butler, Senior Nonprofit CRM Manager

    Jo joined Build after working in nonprofits for over a decade. She understands from experience the challenges inherent in fundraising, development, financial reconciliation, and constituent relationship tracking. Jo brings her long-standing reputation as the “go-to” person for all things data to her Nonprofit CRM Manager consulting work at Build. More »

Transcription: Strategic Tips for CRM Adoption at Your Nonprofit

Link to Transcription: How Data Quality Defines Your Organization is below

Jo Butler:  In this session, I’m going to guide you through some tips to boost user adoption and best practices for your CRM solution regardless of vendor. So, my name is Jo Butler and I am the Senior Non-Profit CRM Manager at Build Consulting. I have worked for more than 13 years at a variety of non-profits focusing largely on improving the effectiveness of their fundraising and development operations. So, I joined Build Consulting about three years ago and have continued to help non-profits through our database management offering. My personal passion is to really help organizations prepare and realize how they can become more effective and strategic by utilizing their rich constituent data to drive better decisions and ultimately their fundraising outcomes.

Introduction to Build Consulting

So, I’m going to give you just a really brief overview of who we are and what we do at Build. Build Consulting was founded in 2015 with the sole mission of helping non-profits like yours to reach their goals. We have a rock star team with both deep data and information technology experience, paired with on-the-ground knowledge of the daily realities of life at a non-profit. And we help non-profits make the most of their technology with the following services. Because of our experience, we understand your unique challenges and can ground your organization in what is needed to get the most from your investments. So, with the CIO offering, Build works with your leadership and other team members to make sure technology serves the needs of the organization. We are also attentive to how new technology brings organizational change, and we help guide that change process.

For our Assessment and Roadmaps offering, you know, the first step to successfully selecting and implementing the right technology is to understand what you’ve already got and how well that is serving your organization and its teams. We’ve developed our assessment methodology based on decades of experience working with organizations like yours, helping each develop a clear plan to achieve short-term improvements and long-term strategic goals. The outcome of any of our assessment projects is a really clear vision for the future and a well-defined strategy to get there, and this is the offering that I lead. We know that it’s often really hard to find high-caliber internal database managers, so we can provide that expertise, and we’ll improve how you manage and use your data so that you are working more strategically and effectively, strengthening your relationships with your constituents using that data. Here are just some of the organizations we’ve helped transform their habits and culture in ways that have allowed them to make the best use of their technology.

So at Build, we often refer to our Information Strategy Framework whenever we start any new engagement. And you’ll notice that technology is intentionally listed last. Build believes that the health of any technology ecosystem is a function of five key elements. Most technology problems have their roots in the way decisions are made, the way operations are organized, and the quality of processes and data. When those are addressed well, technological improvement is easier, but it also helps transform your organization’s habits and culture in ways that allow you to make the best use of technology.

So throughout this presentation, I’m going to use real-life client engagements, where I’ve been the Information Services Manager, to illustrate how implementing the following tips and best practices has elevated CRM adoption.

Improving CRM Adoption

So when I arrived on the first day and logged into their database of record in 2019, the only real users of the CRM, besides two of us in Information Services, were some Development Operations staff. Since then, we’ve onboarded almost all departments, roughly about 85% of the organization, and can report that today we have about 70% of users logging into the CRM at least once a week. But that took a very deliberate, thoughtful, patient adoption strategy.

So, I’m really excited to be presenting at Good Tech Fest to share my knowledge, experience, and the techniques that we implemented at this client, and at additional client organizations, that hopefully you can take back to your non-profit to try.

Barriers to effective CRM adoption and preventative solutions

So, the latest reported CRM failure rate is actually at an all-time high. I think it’s at 63%, and that’s because new technology can be really powerful, but it can’t transform a non-profit on its own. A new system brings new processes, automation, information, roles, and responsibilities. That’s a lot of change, and resistance to change is a given in most CRM adoptions. While there could be a host of reasons why CRM adoption strategies don’t succeed, we’ve discovered some common issues: lack of leadership engagement and governance, insufficient training and resources, lack of accountability, and mistrust of data accuracy and use.

So, I’m going to walk you through some specific tactics and methods that I have deployed to avoid these. So, let’s start with lack of leadership engagement and governance.

So, having executive leadership engagement establishes commitment and encourages collaboration between departments. Without executive support, your CRM is likely doomed from the start. Leadership needs to be the organization’s biggest cheerleader for the use of CRM. And you’ll get that executive buy-in, if you can demonstrate that the CRM will drive growth by resulting in better member retention, more effective marketing, and more funding opportunities. So in order to engage leadership, we really needed to provide them with a tailored executive level onboarding. So, we provided leadership with the level of training needed to understand the role that CRM plays in the ability to track, measure, and evaluate their strategic goals.

Leadership now recognizes how CRM enables staff to provide a comprehensive evaluation report to their Board of Directors. And our internal mantra is if it’s not in the system, it never happened. So, staff and management take their cues from the executive team. So, it’s really imperative that they deliver visible, vocal, and active CRM sponsorship. We also have had the opportunity for Information Services to present at Strategy Council meetings, which are sort of this client’s top-level executives, to provide updates on key CRM projects or to discuss potential roadblocks, or any outcomes from internal meetings, like the Data Governance Committee meeting, which I’ll talk about next.


Importance of Data Governance

So, data occupies a very curious place for many fundraising, program, and marketing teams. They will happily and readily acknowledge its importance, but they also treat it as the exclusive territory of the IT or IS department. They’re very happy to make use of its insights, but getting to know the ins and outs of an effective data management policy is, in their minds, unnecessary. It’s somewhat understandable, but it’s mistaken. So, you need to emphasize that everyone who uses the CRM owns it. I like to think of it like a publicly traded company: anyone who has a share is a part owner, and certain groups hold more shares and therefore more ownership, but who has the ultimate say? So, if each department is asking for a new update, a different workflow, or has conflicting opinions, who makes the call? Which department has the responsibility? This is why I highly recommend establishing a Data Governance Committee.

Data governance forms the basis for org-wide data management and makes the efficient use of trustworthy data possible. It’s important to remember that data governance differs from data management. Data governance acts as the decision-making function that has authority over the data management actions that cross over multiple teams. Data management comprises the actions taken to execute your data governance framework. So, a Data Governance Committee discusses recommended changes to current procedures and the establishment of new coding, and helps define the use of data. The key always is to help maintain that balance between protocol and flexibility, and to empower users to feel a sense of ownership over data standards and quality. So, we formalized a Data Governance Charter with a vision and mission, we created a decision-making protocol, and we defined the participants’ roles and responsibilities. This committee should always involve a representative from every department that has contact, in some shape or form, with the CRM or the constituents that your organization serves. This really helps break down departmental barriers, freeing data from dedicated department silos and creating a uniform, or at least consistent, process for accessing and using your constituent information.

We also created the role of the Data Steward, and while they serve an important role on the committee, this is someone that each department empowers and supports as their subject matter expert. So, each Data Steward must have a really good understanding of the data and processes their team needs. They are able to represent their department’s needs and concerns when any new recommendation is being brought for discussion or approval at those Data Governance Committee meetings. They should also be allowed time at their department meetings to report back to their teammates on what was discussed at the Data Governance Committee meeting, and really be a conduit between the committee and the department. A successful Data Steward really enables the CRM to provide reliable data that their department and organization can trust. And the investment in this Data Steward role will pay for itself many times over, because better data practices allow you to create efficiency and effectiveness throughout the whole of your organization.

Training and Resources

So, the next barrier to prepare for was insufficient training and resources. It seems like a no-brainer, but no matter how you decide to roll out your CRM, it’s imperative that you fully train your team. A CRM can have a steep learning curve, presenting a challenge even for relatively tech-savvy people. Training is vital to the success of CRM adoption. It’s really unrealistic to force a new process on your team and expect them to be productive right away without structured training. So, we ensured that current staff were trained on the overall basic functionality, but then we took the time to meet with each department, sometimes even at the role level, to provide them with task- and role-specific training. We made sure that everyone in the department understood how their part of the process fits into the overall workflow of the CRM. You know, understanding how their role affects the larger picture was a really important driver for them to enter data and engage with the CRM effectively.

So for new employees, we created a well-planned, rock-solid user onboarding experience, which is critical for introducing your employees to the CRM. For the hiring directors, we created a checklist that helps them keep CRM onboarding front and center of their new employees’ onboarding experience. So, you’ll see it’s necessary for directors to be very direct regarding the onboarding of a new employee. It also sets the expectation as to what is needed in order for new staff to gain access to the system and to be trained accordingly. This approach also prompts the director to be really considerate and deliberate about a comprehensive training plan. So new staff being onboarded feel like they’re being incorporated into something that everyone is a part of, and the department’s manager and the department’s Data Steward feel ownership over the new hire’s success with their CRM use. It’s also important to make sure that special tasks, like merging or deceasing records, are only given to the right personnel, so training on those tasks can be incorporated.

It’s also essential to continue training even after the initial software implementation. You will have new employees, people will change roles, and your processes will change. So, employee training is done effectively when it’s reinforced and refreshed consistently. As with any technology, CRM software is constantly adapting to leverage new capabilities. So, regular training opportunities will keep everyone up to date on any new features. It can also be, you know, a great refresher for those who may not be utilizing the CRM to its full potential. So in addition to the initial training, we keep staff informed of new functionality by hosting brown bag sessions. These are just, you know, bring your lunch into the conference room—or, well, Zoom, it’s all virtual now—and we show them functionality that can help them save time, save effort, and automate workflows to remove some of the tedious tasks from their jobs.

We also have a fortnightly luncheon with the Data Stewards, where we sort of emphasize that this is a really safe place to discuss issues that they might be experiencing with their department’s colleagues, like data not being entered in a timely manner, and to share knowledge and maybe even some shortcuts that they’ve found. And then we also encourage staff to participate in, you know, the roadmap sessions or any sort of webinars that your CRM software vendor may host, and also to vote on any of the idea bank submissions that your organization would benefit from.

Communication about the CRM

We also continue to talk about the CRM when given the opportunity to present at internal meetings. So for example, at the monthly all-staff meeting, we use this as an opportunity to inform staff of new CRM functionality, especially when we’re rolling out new improvements. We use this opportunity to show results of an organization-wide campaign using data and metrics, you know, right out of the CRM, and remind everybody of upcoming trainings or brown bags. We also recommend requesting five to 10 minutes on the agenda to present at each department’s regular team meeting, just to discuss any CRM data improvements or issues that are specific to their department’s processes and tasks. Another touch point is that the Information Services team meets quarterly with department directors and their Data Steward regarding the CRM and any information systems priorities. So we ask them what has been working well and what hasn’t been working well, and we get ahead of any new projects that they might be planning without really considering, you know, the data implications. So, you know, plain and simple: if you don’t invest in proper onboarding, training, and continued communication and spotlight on your CRM, you’ll struggle to gain adoption and get buy-in, and just leave a ton of value on the table.

Documenting CRM policies is critical to success

So during these trainings, we have by our side, and always reference, our CRM policies and procedures manual. Documenting policies and procedures really helps ensure data consistency and increases efficiency. And while every organization’s policies and procedures are unique, the manual should provide details on how your organization specifically enters and maintains records and performs standard database functions, like address processing. It should not be a recreation of the standard overall training documentation that your CRM vendor provides. The manual should be really specific on how your non-profit uses, you know, constituent categories, for example, and how they are defined and applied. So for this one client, they are using Raiser’s Edge, so we have manuals both for the database view and for Raiser’s Edge NXT. We provide screenshots and a step-by-step entry guide, because they have, you know, code table definitions and their expected use. And we also created these sorts of one-page cheat sheets for just 30-minute data entry drills, you know, prompting with questions that are specific to a unique task, like reporting actions. We would prefer they do this as it happens, but most of the time it’s every few days or every week that they use this to go through their activities for the previous week and into the next.

We also make sure to update the manual when a new or changed process arises. So, it’s really a living document that’s stored in a central location, so that it’s easy for anybody to find. With this particular client, we use Microsoft Teams to store all the volumes of guides and manuals. You can also remind your users that it’s okay to ask questions and, more importantly, ask for help, no matter how long it’s been since their training. So include some instructions for database users to follow when they need help troubleshooting issues or they have data needs outside of their responsibilities or their skill set.

So for that, we use Teamwork Desk to help track and prioritize requests in a timely and transparent manner. It also takes those requests out of, you know, two or three people’s email inboxes. Other staff don’t need to have access to the tool itself; we just set up a dedicated email address, for example, forward that email to Teamwork Desk, and it creates a service ticket. Then you can tag it with the department and the type of request. We also recommend creating a virtual collaboration space. For this, we use the conversations feature within Microsoft Teams, where we created a channel called Ask a Colleague, where staff can ask each other questions and learn from their peers. We found that this really fosters community and collaboration. It also takes some of the stress off the few of us, you know, Information Services Managers, as being the only resource that can help with any CRM question. You really don’t need to be an expert to provide a simple, time-saving tip.

Addressing accountability and incentives

So now we had staff all trained up and ready to go, but now we have to get people to use the CRM, which brings us to facing the lack of accountability and incentives. Building a culture of accountability at every level of your non-profit is essential to setting up your CRM adoption for true victory. It requires all staff to feel as though what they do, or what they don’t do, has consequences. Performance reviews provide a great opportunity to make sure employees know where they stand, both with their successes and their failures. So, we worked with human resources to incorporate into job descriptions the responsibility for data entry and CRM use. This really ensures that CRM responsibilities are non-negotiable, and now they’re tied to annual performance reviews. Incentives should always be reserved for staff who meet or exceed the expected data entry standards established by your organization. Otherwise, what’s the point of having those standards at all?

Often, you know, rewarding employees for non-performance really discourages those high performers from trying, and it also spreads subpar performance among staff who aren’t putting their best foot forward. It’s really important to focus on and understand how you will measure the impact of your accountability improvements. So at this client, at the beginning of the year each department puts together its annual work plan, and Information Services will meet with each department after they’ve completed it to help align their objectives and goals within the CRM. We help set up or improve any data structure in order for them to enter, track, and report on the goals and objectives in those work plans. Ultimately, you really need to tie the CRM both to the tactics and the outcomes, so you can measure the effectiveness of your efforts.

So, whether it’s in a one-on-one or a full team meeting, incorporating the use of the CRM as a management tool in real time will bolster incentive and accountability. Based on the purpose of those meetings, we created a variety of reports and dashboards that help measure and monitor those activities. So, metrics such as the success of events: how many new names did we gather, total donations, petitions signed, and responses to appeals. There are dollars raised by segment, which message performed better, where prospects are in the pipeline, who needs a thank-you call, who was recently wealth screened and needs a major gift officer assigned, or who was recently disqualified. Using the CRM live, instead of coming to meetings with printed reports or attached spreadsheets, really emphasizes real-time interaction with the data. By using those dashboards live in team meetings, everyone can see that Minnie isn’t entering her actions, and then look at how it affects the department’s overall dashboard.

And then it’s up to the manager and director to address that in the meeting. Those embarrassing whoopsies only have to happen once or twice to incentivize staff to get their entries in, especially before those meetings. So, the focus is more on how the CRM solution will bring about change to the organization rather than on how users will completely adopt the new technology. And it really can be a challenge to measure CRM adoption, because your team might be adept at using the software, but not really maximizing its use.

So, there are various metrics that you can track, short term and longer term, to see whether your CRM is adequately utilized or not. So, not just reporting on who’s logged into the system, but who has entered actions, who has entered a follow-up action, who’s updated a constituent’s email or phone and added a communication preference or solicit code, and then also added a supporting note detailing why that was applied, who’s updated a relationship record, and whether they first searched for the constituent in the system before adding a new one. You can check when they last ran a pipeline or a particular report, and then pull a query on what they did in the system after pulling that. You know, holding staff accountable to the expectations you set will remind them that the CRM is here to stay.
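The usage signals described above can be sketched programmatically. This is a minimal illustration, assuming a hypothetical activity-log export with `user`, `action_type`, and `timestamp` fields; no CRM vendor provides exactly this format, so treat it as a pattern rather than a recipe:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical action types; real CRM exports will use their own labels.
TRACKED_ACTIONS = {"login", "action_entered", "follow_up_entered",
                   "contact_updated", "relationship_updated", "report_run"}

def weekly_adoption(rows, as_of):
    """Count active users (any tracked activity in the past week) and
    engaged users (activity beyond merely logging in)."""
    week_ago = as_of - timedelta(days=7)
    per_user = defaultdict(set)
    for row in rows:
        ts = datetime.fromisoformat(row["timestamp"])
        if ts >= week_ago and row["action_type"] in TRACKED_ACTIONS:
            per_user[row["user"]].add(row["action_type"])
    # "Engaged" means the user did something besides log in.
    engaged = {u for u, acts in per_user.items() if acts - {"login"}}
    return {"active": len(per_user), "engaged": len(engaged)}
```

The active-versus-engaged distinction mirrors the point above: login counts alone can hide users who open the system but never enter data.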

So now we have leadership on board and cheering away about the CRM. We have everybody trained, with a clear understanding of what they are required to do as part of their new responsibilities. Then they log into the system and the data is a mess. So, naturally they think, why bother? My spreadsheet is correct and the information in there is better. And that brings us to our final barrier, which is the mistrust of data accuracy and use.

The mistrust of data accuracy and use

So, the CRM software itself is not always the reason why adoption fails. Sometimes it’s the data sources that are to blame. Data quality is a major contributor to dissatisfaction even when the system is actually working fine. A good example of this is incomplete or incorrectly entered constituent data, which can make staff skeptical about all the information in the CRM. If the names are wrong or the phone numbers are tied to the wrong prospect record, for instance, it’s not hard to see why a fundraiser would be reluctant to rely on the CRM to make appeal calls.

So, the efficient management of data is an important task that requires centralized control mechanisms. So, in Information Services—hold on, just wanted to make sure you can still see my screen and hear me. So, data quality needs to be attacked at three levels. The first is to never let data into the system, whether in the initial migration or conversion or any subsequent input, without eyeballing it and scrubbing and cleaning it up first. The second is to spot sources of data pollution and systematically correct them. There are some automation and tools that will help, but really rely on those Data Stewards who both know the meaning of data entries and care about the quality. In CRM, there’s no such thing as self-healing data. And the third is to identify business processes for data interfaces that corrupt the semantics of CRM data.

So, we created and implemented a series of queries that identify where data needs to be cleaned, or help to identify where more training might be needed. These are run weekly, monthly, or quarterly, and we use Teamwork Projects to keep track of that schedule. We can provide you with a link to some examples of those data hygiene queries as part of a follow-up to this presentation. Another giant barrier to CRM adoption is if data that’s coming from different sources needs to be manually recreated or re-entered. So, we make sure that data feeds coming into the CRM from external sources are working successfully. And this helps with data hygiene and staff’s trust in the data.
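The scheduled hygiene queries can be sketched in the same spirit. This is a minimal, hypothetical example; field names like `email`, `deceased`, and `solicit` are illustrative, not any vendor’s actual schema:

```python
import re

def hygiene_issues(records):
    """Flag common data-quality problems in constituent records for
    the weekly review cycle described above."""
    issues = []
    seen_emails = {}
    for r in records:
        rid = r.get("id")
        email = (r.get("email") or "").strip().lower()
        if not email:
            issues.append((rid, "missing email"))
        elif not re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", email):
            issues.append((rid, "malformed email"))
        elif email in seen_emails:
            # Same email on two records often signals a duplicate constituent.
            issues.append((rid, f"possible duplicate of {seen_emails[email]}"))
        else:
            seen_emails[email] = rid
        if r.get("deceased") and r.get("solicit"):
            issues.append((rid, "deceased record still marked for solicitation"))
    return issues
```

Queries like these surface both cleanup work and training gaps: a cluster of malformed entries from one department usually means the process, not the person, needs attention.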

If your CRM is compatible, I would highly recommend the purchase of ImportOmatic. This is an amazing tool. I don’t get paid to plug this product, but it’s saved me a lot of time. You can create data dictionaries to help translate common misspellings of table values into your configured values in the CRM. It helps with proper casing. You can add additional data that’s not necessarily in the input file, like adding an attribute or constituent code, or, you know, if you’re importing a group of people from an event, you can also add certain communication preferences. It’s really just a great tool for managing data upon import.
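To illustrate the data-dictionary idea, here is a minimal sketch of mapping common misspellings onto configured table values. This does not reproduce ImportOmatic’s actual configuration format; the field names and mappings are invented for illustration:

```python
# Hypothetical data dictionary: per-field mappings from raw input
# variants (lowercased) to the values the CRM is configured to accept.
DATA_DICTIONARY = {
    "state": {"virgina": "VA", "va.": "VA", "wash dc": "DC"},
    "constituent_code": {"indiv": "Individual", "org": "Organization"},
}

def normalize(field, raw_value):
    """Translate a raw import value to its configured CRM value,
    falling back to the trimmed original when no mapping exists."""
    value = raw_value.strip().lower()
    return DATA_DICTIONARY.get(field, {}).get(value, raw_value.strip())
```

Applying a translation step like this at import time is what keeps one-off typos in source files from becoming permanent variants in your code tables.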

And then it’s also important to regularly screen for national change of address, age finder, and deceased records, or any data service that your non-profit could benefit from. So for one client, knowing what congressional district people live in is a data service that we use. So, while there are probably only a few people in your organization who have permissions to import data in bulk, it’s very important to stress that the health and success of the CRM is everybody’s responsibility. If you see something, either fix it right then and there or say something.

So, as we come to a close here today, I really hope that this outline of proven, successful tactics to increase your organization’s use of CRM is helpful, either with your initial CRM implementation or with a re-adoption strategy. I’m available for some questions; otherwise, feel free to email me, and check out our website for some resources that will help you socialize these concepts internally or even roll them out straight away. But I just really want to thank you for your time and attention today. And yeah, let me know if you have any questions. Actually, I’ll just keep that up.


What are your favorite CRM platforms at a reasonable price for non-profits?

That’s a really sort of loaded question. It depends on, you know, what services you offer. Do you have a membership service? Do you manage volunteers? Do you do peer-to-peer fundraising? Do you have an advocacy component? So, it would really depend on what your non-profit’s activities are. Yes, Jenny’s right, Salesforce offers 10 free licenses to non-profits. Yeah, take your time. Trying to get away from specifics—we’re very vendor agnostic at Build. So, it really depends on the specific client’s needs.

Yes, I can definitely share this PowerPoint on Slack.

So, ImportOmatic is not only for Raiser’s Edge. They have plugins for Engaging Networks, for example, and I think it’s compatible with a lot of different CRM products. There are some advantages; there are some direct connections. They have a direct connector with Luminate Online and Engaging Networks, for example. But you can also just, you know, download a CSV file and import that CSV file into your database.

Katie said that these suggestions are helpful, but with smaller organizations the committees are often going to overlap. So, what I would do is, if there’s a Digital Communications Committee or an External Relations Committee, make CRM, or data, just a small portion of those regular standing meetings. So you don’t have to be a part of another committee and take up another hour of your time. I would just carve out a small portion of those allocated meetings to directly talk about CRM or data or technology.

Yeah. If you don’t want to ask questions now over chat, or if you’re too shy, please feel free to email me. I’ll be happy to answer any questions. I’ll make sure this presentation’s available on Slack. Yeah, maybe you’ll get some time back so that you can grab some lunch before your next session, but again, thank you so much, everybody, for your time and attention. I hope it’s been wonderful.

Transcription: How Data Quality Defines Your Organization

Peter Mirus:  All right. Again, welcome everybody, and thanks for joining. This session is titled “How Data Quality Defines Your Organization.” I hope you can see that on my screen right now.

My name is Peter Mirus, and today I will be introducing a simple framework for understanding data quality that you can use to drive data quality conversations at your organization, hopefully leading to improvements in the quality of your data as well as greater mission impact, because that’s what we’re all about.

Just a couple of housekeeping notes before we get started here. Please ask your questions or provide comments as you like via the Good Tech Fest Slack channel, “How Data Quality Defines Your Organization.” It’s kind of tricky to track Q&A and chat and all that in multiple different locations at once, so I think I have disabled the chat feature for Zoom so that we can try to push everything into the channel. So, anytime you would like, post there; I’ll actually ask you to post something specific in there in just a moment. That would be great.

I know we are not all in person today, unfortunately. It would be great to be with you in the same room, but please avoid multitasking because you might just miss the best part. And again, the session is being recorded like all of the sessions, and links to recordings will be available for registrants after the fact. I’m not exactly sure what the lag time is on that, but if you miss this or you have to jump out for some reason, just know that the recording should be available so you can pick up where you left off.

Meet the Presenters

As I said, my name is Peter Mirus. I’m a founding partner at Build Consulting. I have been working with the non-profit sector, or the social good space, for over 20 years, serving over 100 clients. I have three primary areas of expertise: information strategy, non-profit business and operations strategy, and communications. I actually started out my career many, many moons ago in graphic design and then got into technology and data because I wanted to know how my graphic design clients were feeding information about their mission and success into their campaigns and into the images of mission success that I was trying to create for them. So, I wanted to know how they were coming up with the statistics about their success, what drove those, and whether or not they could be doing better.

Build Consulting, which is my firm, was founded on this premise. But one of the challenges is that over 50% of non-profit technology projects fail, and the reason for that is that the technology moves forward, but the organization does not. We often put this visual in front of our clients. It says: old organization plus new technology equals expensive old organization. This is probably no more frequently seen than when an organization has some sort of problem with its data and wants to introduce a new piece of technology, thinking that it will mystically solve all of its data quality issues. Sometimes the technology is at fault, but more often than not it can be just as much the need for change inside the organization and for better habits in regard to data quality management. So, that’s one of the reasons why we underscore that transformation is critical to our clients’ success when they are adopting new technology.

And just to show you how we think about where data sits in a sort of framework for success: when you are thinking about making improvements in data or technology at your organization, it really all comes back to leadership and governance. Do you have the right policy? Do you have the right leadership? Then your operational capacity: what is the quality of the processes you are using to manage your programs and collect your data? Then the structure and integrity of the data itself, which is what we are going to be talking about at a high level today from a data quality perspective, and then the technology. The technology is intentionally last because you have to get everything right upstream of technology in order for technology to be effective.

So, here is the quick agenda for our time together today. First, I would like to get to know you a little bit. Usually, when I do these kinds of sessions in person at conferences, I have everybody go around the room and say who they are, what their role is inside their organization, and what they were hoping to get out of today’s conversation, so I can sort of weave that into the course of the information that I’m going to be presenting today. So, if you would, please take a moment right now to go over to the Slack channel—hopefully you have it up and available—and just introduce yourself really quick for me. Let me know your name (first name is fine) or maybe just your title at your organization, and what you specifically were hoping to get out of this conversation. I do see a few familiar faces among the profile photos today. That’s great! I’m not sure if I have run across you at a previous Good Tech Fest or at another non-profit technology related conference, but thanks for joining today. Good to see you again.

The Agenda

And while folks are doing that, I’ll just run through the rest of the agenda. I’m going to talk a little bit about my firm’s observations in regard to data quality in the non-profit sector. Then I’m going to talk a little bit about making the case for data quality—how you build an argument for improved data quality—and then we’re going to talk about the dimensions of data quality. The way I look at it, there are five dimensions of data quality, and we’re going to talk about some ways that those apply in different types of organizational program settings. Then I’m going to finish with just a few recommendations about how to advance these kinds of conversations inside your organization, and we’ll do some Q&A at the end. Also, any time you have a question or would like to interject a comment in the course of this, please just run it by me on the channel and I’ll do my best to keep an eye on that as well as what I’m doing at the same time.

Just going to pause for a moment to take a look at the channel posts—so Amy, Stu, Tom Stuart, I know we have run across each other before at some point, good to see you. Mitchel, Tiffany—yeah, I see a lot of different people. Some vendors and independent consultants, some program managers—that’s great—data quality analysis workers, that’s good. Sweet. I’m trying to scroll along here. Ariel, hope I’m pronouncing that right. Doug, good to see you. Mark, Alex, Alex—great. So, a lot of different people from a lot of different types of organizations and different backgrounds. That’s great. And yeah, if you have any additional questions or comments about what you’re trying to get out of today’s session, that would be great. I really appreciate that. Let’s move it along here.

So, my firm works exclusively with non-profits, and we do a lot of work in broad information strategy—information and technology assessments and multi-year roadmaps for non-profit organizations. We do a lot of work in specific software assessments and selection—your CRM, ERP, program data management, program management, volunteer management, et cetera—and we also provide some implementation support services, helping to guide clients through their implementations. We’re completely vendor agnostic and independent. We don’t have formal relationships with any vendors, and that sort of gives us an opportunity to engage with non-profits on their side of things and really see what their data quality challenges are. We do also offer some outsourced data management services, so we work hand in hand with our clients to try and help them with their data quality challenges. So anyway, throughout all of these different kinds of engagements, we’ve seen organizations encounter and then deal with—or fail to deal with—different kinds of data quality issues. The kinds of decisions that they make regarding data quality are interesting to us, and to me, in three different ways.

Can data quality determine success?

And the first is that data quality decisions have the ability to determine how successful an organization will be—basically, how it sets itself up for success, mediocrity, or failure based on its ability to manage and leverage data to make informed business decisions and drive impact. About five years ago, I read an article about performance data management and reviewing that data to drive performance improvement in the UK. The study indicated that about 92% of organizations in that sector collect performance data about their programs, but only 6% use that data in any formal way to evaluate and drive performance improvement. Clearly, we’d like to see many, many more organizations within the social good sectors collect high-value data about their programs and really use it to drive impact.

The second thing is that data quality decisions are, I think, profoundly revealing of the strengths and limitations of an organization. Does the organization have the strategic vision to make appropriate investments in data quality? Does your organization know what relationships it has with other organizations and individuals? What data is profoundly important to nurturing and growing those relationships? And as I said before, it all comes back to leadership, and you really learn whether the organization has the leadership and the discipline to require staff to adhere to policy standards and business processes that result in good data quality.

And third, many organizations don’t really or fully comprehend the full picture of what data is. Data is all of the information that your organization collects, both qualitative and quantitative, or both structured and unstructured. Another way of putting that more simply is that it’s about both the narrative and the numbers. You really need to have both in order to understand your organization’s operations and, as I said before, its program effectiveness, and how to use that to drive enhanced mission impact.

Making the case for data quality

So, let’s talk a little bit about making the case for data quality. This picture that you see on your screen here is an aerial photo of the Washington Navy Yard in Washington, DC. It was taken in 1985 and, superficially, it appears to be a somewhat prosaic urban landscape. Folks who are local to this area—I’m here in Northern Virginia—would be familiar with where it’s located. In 1980, Congress created the Superfund to pay for clean-up of the country’s most hazardous waste sites. Then in 1998, the EPA put the Washington Navy Yard on the National Priorities List, which means that it’s one of the 1,700 prioritized clean-up projects among the 47,000 waste sites in the United States. And the Washington Navy Yard site has the status, at least the last time I checked, of active—meaning that the clean-up activities had not yet been completed.

And what you see here on your screen now is a picture of the Anacostia River, on which the Navy Yard is situated. It’s part of a complex ecosystem—the river—about which much data has been collected and about which there is still much to be learned. Now, in data quality we sometimes talk about the real-world object: data exists to describe a real-world object. The better the quality of your data, the better you are able to understand the real-world object, your relationship to it, and how your actions impact it. And just as the river flows continuously, sometimes the real-world object is in a constant state of flux—no more so than when you’re collecting data about people. They don’t stay in one place. They move around, their condition changes, their behaviors change, and the actions in which they engage change. So, this is just to point out that if you have poor-quality data when you’re trying to describe something that’s inherently hard to describe already, you’re putting yourself in an even more difficult position.

And good quality data has been and continues to be part of the Anacostia Watershed clean-up process. That has seen participation from numerous different organizations, and gathering and integrating high quality data from various sources is critical to the work of these organizations—and to other organizations, both large and small, participating in worldwide conservation efforts. So, whether your organization or agency is removing contaminants from the Washington Navy Yard and the Anacostia Watershed, or running medical field trials in Africa, or training small businesses in Afghanistan, or empowering women and girls in inner cities in the United States, or combating domestic violence in the United States, the information you gather about the people you serve and their local environment has a good deal to do with the impact of your efforts. So, with so much of the data we wish we could obtain lacking, the quality of the data we do have to describe those real-world objects becomes of paramount importance, because data quality directly helps to create life quality.

Data quality is not just about accuracy—as we’ll cover in the upcoming parts of the conversation, it’s multi-dimensional. We all need to place the right information in the hands of all the people in our organizations, from the C-suite to the frontline staff to the constituents and stakeholders that we serve in the community, and that helps create greater efficiency and impact. By doing this, we can help people do new and amazing things. That’s what it’s all about, right? I like to say—and this is viewed as a little bit extreme by some of my colleagues in the space—that data is about love. If we’re in the social sector, ultimately our job is to love the people and things that we are in relationship with and are trying to enhance and preserve as part of our missions. And if data is about describing those relationships and the people and things that we serve, or put ourselves in relationship to, then it should be about love. The mission is about love, and data is about love. And if that’s not what it’s about, then what are we here for?

And finally, I would say the data that you have enables you to have a voice, to be heard—you want to get through to people. And the stories that you tell with that data create resonance on two different fronts: intellectual resonance and emotional resonance with your key audiences. It communicates the value of your work and helps to secure the funds and influence necessary to do the work well. So, if you were going to go into your organization and put together an impassioned pitch for why data quality is important, these are some of the aspects that I would focus on, and I would use stories like this one and others that you’ll see throughout the course of this presentation to really hit those intellectual and emotional resonance points with executives, decision makers, and others, to really put the emphasis on good quality data.

What data quality is

So now I’d like to transition and talk about what data quality is. I often frame it in terms of these five dimensions for the sake of simplicity, and those who are familiar with data quality terminology used by some major funders, like USAID, will probably recognize some points of familiarity in this, although it’s not exactly the same. These dimensions are: first, completeness; then validity and accuracy, which are distinct but well paired with each other; consistency; timeliness; and integrity. We’re going to take each one of these in turn and talk about them. And as I said before, please feel free to post any questions, or even comments that you have based on your own experience.

Completeness

So, first we’re going to talk about completeness, and the case study that we’re going to use here is gender equality. Let’s say that your organization is running an economic development program in Afghanistan, as one of my former clients was. And let me tell you, you have to be really mission driven to deal with the challenges of doing these kinds of projects in places like Afghanistan and Iraq and Lebanon and other challenging environments. This particular program was focused on training small business owners and entrepreneurs to engage in the global economy. As an aspect of the data collection, the program managers—they call them chiefs of party, sort of on-the-ground program directors—were tasked with collecting information on women served through these programs. This was an imperative coming from both the funder and their own organization: a big focus on engaging with women. The program was executed through multiple technical assistance projects, run out of various locations within the country. And so, the data collection was being done on paper and then entered into spreadsheets.

However, the data collection method was not standardized in the best way, let’s just say. Some projects had a field to capture the beneficiary’s gender, and some did not. Some of the projects that had a gender field did not consistently manage the paper documents used to capture that information, resulting in the loss of that backup information and contributing to data gaps—or at least challenges in supporting that the data that was collected was accurate. The problems within this scenario impacted both the breadth and the depth of the information. The data was not complete. And in order to successfully provide reports based on the data, the program managers had to make best guesses to fill the information in, and this impacted the data quality and obviously also the trustworthiness of the data. So, this is the kind of completeness problem that you want to avoid. Completeness, as I said, is about breadth and depth. It’s about the fields of information that you’re capturing as well as there being a value in each record for each field. So, if you think about it in terms of a spreadsheet visual, you have the names of the fields going across the top and the records going down vertically. You want to make sure that you have all of the appropriate fields, but you also want to make sure that each one is populated—a complete record, in other words.

So, you want to make sure that the gender field is available as an attribute of the data object you want to describe, and you want to make sure that the field has information in it. This is a simple example of the problems that can be created by incomplete data. When a major funder like USAID loses confidence in the quality and trustworthiness of the data because they see problems, they lose confidence in your ability to deliver on the program. But on the flip side, having good quality data and backup for that data causes the funder to gain confidence in the program, and that increases the chances that the program will be expanded or renewed. So again, for completeness, you really want to make sure that you have the breadth and depth that you need. Does the data have all of the expected attributes or fields, and is the data set as fully populated as expected? Completeness is very simple and often one of the first things that people think about for data quality. Do we have the data? That’s the most basic thing, right?
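To make the breadth-and-depth idea concrete, here is a minimal sketch of a completeness check in Python. This is an illustration rather than anything from the program described above; the field names and records are hypothetical.

```python
# Hypothetical expected attributes (breadth) for each beneficiary record.
EXPECTED_FIELDS = {"name", "gender", "location"}

def completeness_report(records):
    """Measure breadth (are all expected fields present in the data set?)
    and depth (what fraction of records has a non-empty value per field?)."""
    present = set().union(*(r.keys() for r in records)) if records else set()
    missing_fields = EXPECTED_FIELDS - present  # breadth gaps
    fill_rates = {}
    for field in EXPECTED_FIELDS:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        fill_rates[field] = filled / len(records) if records else 0.0
    return missing_fields, fill_rates

# Sample data: the second record has a depth gap in the gender field.
records = [
    {"name": "A. Karimi", "gender": "F", "location": "Herat"},
    {"name": "B. Ahmadi", "gender": "", "location": "Kabul"},
]
missing, rates = completeness_report(records)
```

In this sketch, no fields are missing entirely, but the gender field is only 50% populated—exactly the kind of gap the spreadsheet visual above makes easy to picture.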

Validity and Accuracy

Now I’m going to talk a little bit about validity and accuracy. In the world of data management, including performance data management for programs, validity is, I would say, defined as conformity to a domain of values. And that domain of values might be a couple of different things. It might be an actual set of valid values, such as a list of states or provinces. It could be a range of values, such as a number between one and 100. Or it could be a rule that would generate a value, such as GPS coordinate fields being automatically populated from a street address. In other words, validity is judged by comparing data to criteria or constraints. So, in order to test and measure validity, you need to know the values or rules to which the data should be compared. These are the rules that prevent invalid data from being entered, and that goes a long way toward helping avoid inaccuracy. And you can extend the definition of validity to create rules for preventing or assessing duplicate data, which in data sets over a very small size is difficult to assess visually, right? So, you want to try to have those system-based rules in place. For example, you can create a rule for a beneficiary database indicating that it should be impossible to have two records with the same last name and street address.

But you can be using a piece of valid data that is nonetheless inaccurate. So, validity and accuracy are related, but they’re not the same thing. In order to examine accuracy, you must examine the real-world object and compare it to the data. For example, if I was filling in a state field and entered MD for Maryland, that would be considered a valid response because it conforms to that list of states, that domain of values. But if I entered Maryland and I lived in Virginia, the data would be valid by the standard that we designated, but not accurate.
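The two kinds of rules described above—a domain of values and a duplicate rule—can be sketched in a few lines. This is a hypothetical illustration: the state list and the record fields are made up, and the closing comment restates the point that validity checks cannot catch inaccuracy.

```python
# A hypothetical domain of values: a valid entry must be on this list.
VALID_STATES = {"MD", "VA", "DC"}

def is_valid_state(value):
    """Validity as conformity to a domain of values."""
    return value in VALID_STATES

def find_duplicates(records):
    """A duplicate rule: flag record pairs that share the same
    last name and street address (compared case-insensitively)."""
    seen, dupes = {}, []
    for r in records:
        key = (r["last_name"].lower(), r["street"].lower())
        if key in seen:
            dupes.append((seen[key], r))
        else:
            seen[key] = r
    return dupes

# Note the limit of rule-based checks: "MD" passes is_valid_state()
# even if the person actually lives in Virginia. Accuracy requires
# comparing the data to the real-world object, which no rule can do.
```

The duplicate rule is the kind of system-based check that scales past what anyone can spot visually in a large data set.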

And so, you can assess validity, as I said, sometimes automatically, preferably in a consistent, rule-based way. But in order to really assess accuracy, you do need to actually compare the data to the real-world thing that it’s describing, and sometimes that is more easily said than done. So, let’s tie this to a use case. Last week this man was a $10,000-per-year donor to your organization, but this week he’s no longer your donor. Why? Well, this man’s wife was on your mailing list, and some donations were made in her name and some in his, and then unfortunately this man’s wife died. The man let your organization know that his wife was deceased, and a development person at your organization accessed the woman’s record and selected his wife’s status as inactive/deceased from the dropdown list. The dropdown list ensured that the person making that data change could only select from a range of values. Therefore, it was impossible for the data to be invalid, and in the donor database, that information also accurately reflected the real-world situation. However, because of poor data integration between the donor database and the mailing list, the wife was never taken off the organization’s mailing list, and in that separate data set, she continued to have a status of active. Therefore, even though the husband notified the organization several times that his wife was deceased, the organization continued to send mailings to her. The person doing the data entry into the donor database reported the problem to a manager, but it was not red-flagged as a potential persistent data quality issue. As it turned out—and this is something that happened in real life—the problem took only a thousand dollars to resolve. But by the time the organization recognized and reacted to the data quality problem resulting in the inaccuracy in one part of the data set, the man was emotionally upset and had lost confidence in the competency of the organization.

And I don’t know if you have had similar experiences in your own life, being on the receiving end of constituent email or physical mailings, where you receive something from an organization that indicates that not all of their systems understand your relationship with them in an accurate way. For example, you get offered to donate to something that you’ve already donated to, or you receive a message about being a member of a donor program or other constituent program that you’re not actually a member of, or an invitation to join something that you’ve already joined. So, this fellow reasonably got upset and, as I said, lost confidence in the competency of the organization. He cancelled that $10,000 pledge and began giving the same amount to another organization. So, for lack of a thousand-dollar data quality decision being made proactively, the organization lost $10,000. Now you might say, okay, well, $10,000—maybe in the grand scheme of things for an organization that raises, say, $10 million a year, that’s not such a huge deal. But in actual fact, what happened is the organization lost $200,000 in potential revenue. That was what they expected to get from this donor, based on their overall statistics, over his lifetime. So that’s a delay in making a thousand-dollar fix decision to correct an accuracy problem. Remember, in both systems the status for the wife’s constituent record was valid, but in one system it was accurate and in the other system it was inaccurate. So, for lack of that thousand-dollar decision being made proactively, it cost $200,000 after the fact.

So that’s something that we all want to avoid, right? I wish these kinds of stories were more uncommon than they are, but this is actually a fairly common scenario. And it’s really interesting, because organizations don’t often do the kind of follow-up that they should do with lapsed donors, or donors in these kinds of situations, to find out what the challenge really was. Oftentimes that kind of due diligence or investigatory work can help identify data problems that you might not otherwise have spotted, because of, as I said, the challenges in actually comparing the data’s accuracy to the real-world thing that it describes.

Consistency

So now, moving on to consistency. This is probably one of the hardest things for organizations of all stripes to get right. It’s just very challenging. Data consistency is ensured by the methodology applied for data collection and that methodology remaining consistent over time, and it is measured by the degree of similarity, or absence of variety, within the data. We can measure consistency of similar data over time by looking for trends and differences from the trends. Consistency is what allows for the authentic analysis or interpretation of data over time and across space, and in order to ensure that this is possible, the data collection process must be well-documented and standardized to the fullest extent possible. For example, considering every program within a vacuum can be deadly from the standpoint of organizational indicators that, aggregated, will provide performance data to both the organization and its key funders, as well as potentially to industry analysts and influencers. If consistent collection rules for organizational indicators are not applied across all of the programs, then the organization’s performance data may be off in ways that are difficult to spot just by looking at the data, absent the ability to evaluate the record of differing data collection methodologies and then make appropriate corrections in the interpretation. This is no more than to say that if the methodology you use for collecting and managing data changes over time, but no record of that change is made, it becomes increasingly difficult to do historic trend analysis. And so, I’ll now provide an example to help this really sink in.

In this case story, I’m going to talk about consistency and youth services. Consistency will frequently be a problem when critical business definitions change over time, but the fact of that change is not recorded as a separate data point that can be applied to help interpret the data trends. For example, an organization’s definition of an at-risk youth might change over the course of a prolonged period of time, and data might be collected relative to that definition. When that occurs, there is a lack of consistency within the data collection, and hence within the data itself. The fact of the definition changing is a critical piece of data necessary to interpret that lack of consistency and make adjustments accordingly when doing the analysis. Otherwise, it might seem as if the real-world object—in this case, the number of at-risk youths—had changed, when in actual fact what changed is how the organization defines “at-risk.” So, let’s just say that at-risk was related to conditions in their local environment—their living environment, their school environment, et cetera—and the definition of which aspects of that environment should be taken into consideration changed. Even that might influence the number of youths considered at risk. And you could compound that by introducing another challenge of consistency: what if the definition of youth changes over a 10-year time span, as well as the definition of at-risk? You can really see that if you don’t maintain a log or record of those business definitions changing over time, you wouldn’t necessarily know, looking back at the historic analysis of the data, that, say, in 2010 the definition of at-risk changed, that it changed again slightly in 2012 and in 2018, or that, say, in 2013 the definition of youth expanded from ages 13–18 to ages 12–18.
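The log of definition changes described above can itself be kept as data. Here is a minimal sketch, with hypothetical dates and definitions, of a change log that lets an analyst ask which definition was in effect for any given reporting period.

```python
# A hypothetical business-definition change log. The log itself is the
# extra data point that makes historic trend analysis interpretable.
# Dates are ISO-format strings, so plain string comparison orders them.
DEFINITION_LOG = [
    ("2010-01-01", "at-risk", "home environment only"),
    ("2012-06-01", "at-risk", "home + school environment"),
    ("2013-01-01", "youth",   "ages 12-18 (was 13-18)"),
]

def definition_in_effect(term, date):
    """Return the definition of `term` that applied on `date`,
    i.e. the most recent entry at or before that date."""
    current = None
    for effective, logged_term, text in DEFINITION_LOG:
        if logged_term == term and effective <= date:
            current = text
    return current
```

With a log like this, a drop in the at-risk count between 2012 and 2013 can be checked against a definition change before anyone concludes the real-world population shrank.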

So, these are cases where a lack of consistency will kill you, man. Many organizations that I’ve worked with actually do a fairly good job of maintaining good completeness and validity and accuracy, or at least they’re able to remediate problems in those areas after the fact—sometimes not, obviously, but sometimes yes. However, they might not have been in the habit of documenting when business definitions or processes change, and that then becomes their Achilles’ heel in trying to present an authentic interpretation of their organization’s performance over a period of time.

Timeliness

Now I’m going to talk about timeliness and what that is. Timeliness refers to both the regularity of the data collection and the availability of that data to decision makers, which are two distinct but related things. The overall goal is to decrease the time between when the real-world object changes and when that change becomes known by all relevant parties. Relevant parties could be program staff, organizational strategists, funders, and other information consumers that need to make good decisions based on that information. So, there are some different points of importance to consider regarding timeliness. The first is that if the data has a high degree of volatility, there is an increased risk that the data will not meet standards for timeliness, meaning that by the time the information becomes known in decision-making quarters, the real-world object has changed. Volatility is the degree to which data is likely to change over time. An example of low volatility data, historically, is considered to be gender, but an example of high volatility data is salary.

And of course, all of these things are relative. Yes, gender can change, but overall, it is not considered to be a high volatility data point, at least not as far as I’m aware. But data that has high volatility will need to be checked more frequently to ensure it remains timely. So yes, you have to check both, obviously, but there are different degrees of volatility within those designations. Data also fails to be timely if there is a lag between when a fact becomes known and when it becomes available for use. For example, if program staff in the field become aware of a change to a real-world object that is important to project or program outcomes but fail to communicate it to program decision-makers in a timely manner—perhaps because the change occurred between regular reporting intervals—that might create a problem. This highlights the need for all staff to follow both the letter and the spirit of the law when collecting and transmitting data. And data also fails to be timely when there is a lag between when the data is updated at the source and when it becomes available to decision makers at the other end of the information chain. For example, if data is contained in a spreadsheet at the program office but does not get transmitted to HQ, then you have problems with timeliness. So, timeliness is sort of multifaceted, but again, it’s really about making sure that you decrease the time between when the real-world object the data describes changes and when that change becomes known by everybody that’s making decisions.
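The idea that high-volatility data needs more frequent re-verification can be sketched as a simple staleness check. The per-field review intervals below are hypothetical, chosen only to mirror the salary-versus-gender example above.

```python
from datetime import date, timedelta

# Hypothetical review intervals, scaled to volatility: a high-volatility
# field (salary) must be re-verified far more often than a low-volatility
# one (gender).
REVIEW_INTERVAL = {
    "salary": timedelta(days=90),    # high volatility: check quarterly
    "gender": timedelta(days=730),   # low volatility: check rarely
}

def stale_fields(record, today):
    """Return fields whose last verification is older than their
    volatility-based review interval."""
    return [
        field
        for field, last_verified in record["verified_on"].items()
        if today - last_verified > REVIEW_INTERVAL[field]
    ]

# Both fields were last verified on the same day; only the
# high-volatility one goes stale within a few months.
record = {"verified_on": {"salary": date(2021, 1, 1),
                          "gender": date(2021, 1, 1)}}
```

A check like this doesn’t make data timely by itself, but it surfaces which facts are most likely to have drifted from the real-world object since they were last collected.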

And so, let me talk about that in terms of another case story. This is a picture of the devastation that occurred on the New Jersey coastline a number of years ago during Hurricane Sandy. Timeliness was of critical importance for this disaster response situation, as it is in all disaster response scenarios. The timeliness of accurate data, such as weather forecasting and resource availability and positioning, helps governments and relief organizations to successfully deliver services to assist those in urgent need because of a natural disaster such as this—or, in another example, some sort of mass human rights violation or civil strife. In the events leading up to, during, and following a large-scale event like Hurricane Sandy or a civil conflict, the timeliness of data can make a huge difference to providing timely relief. And you will find in these particular kinds of situations that the efficiency with which the organizations involved exchange information—and the timeliness with which that data can be combined and interpreted for different actions—has a lot to do with how organizations and agencies fare, the effectiveness of their activities, and hence how they fare in the eyes of public perception and the assessment that takes place in the aftermath of such an incident. Obviously, this has been very relevant to all of us in recent months in relationship to the national discourse about the quality of data associated with the pandemic—both the data that’s known about the disease itself and the data about effective remediation measures on a number of different fronts, both in public policy and in developing and applying vaccines and so forth. And we’ve all been treated, on an almost daily basis, to opinions and evaluations of whether or not that data was being exchanged and interpreted and used to inform decisions in a timely way.

As well as a myriad of other things. So that is obviously the most present example for us globally, one that impacts everyone, and it relates to data quality in general but specifically to this aspect of data quality, which is timeliness. And I'm sure you're also aware of the public perception aspect of that as well.

Data Integrity

So last, I'm going to talk about data integrity. Data integrity is how all of the different parts fit together, how the pieces make one whole thing. If you have a large database, a data warehouse, a data mart, or even a file share with multiple Excel files containing records from a range of systems, people who are using that data have an expectation that they'll be able to connect it. They expect the data to be integrated. This is one of the things that is taken for granted more today than it was, say, 10 or 15 years ago, and the integrity of the data is one of the first things people will question when they cannot use the data in the ways that they expect. So it is important to have a plan for how you are joining data from different data sets or tables, and for how well the joined data fits together. When you have multiple information systems and the data within them relates to each other, you need a way to make sure the integrity of that data is preserved when it is related for the purpose of gaining insight and making decisions. And I just want to note here that the term integrity is used different ways in different environments. Many people, when they talk about data integrity, think of it as synonymous with data quality broadly, or they think about whether the data or the database has become corrupted in some way. In the sense that we're talking about today, we're talking about integrity almost as integrability: how the different parts fit together to make one integral whole. At a very high level, measures of integrity are measures between different tables in the database: how you join data from different data sets and how well it fits together. It's basically, as I said, your ability to relate the data cohesively.
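One very common way this "do the parts fit together" question shows up is records in one data set referencing records in another. Here is a small sketch, with entirely made-up records and field names, of checking whether every child record actually joins to a parent record:

```python
# Hypothetical data sets: every donation should reference a
# constituent that actually exists in the constituent list.
constituents = [
    {"constituent_id": 1, "name": "A. Donor"},
    {"constituent_id": 2, "name": "B. Donor"},
]
donations = [
    {"donation_id": 101, "constituent_id": 1, "amount": 50.0},
    {"donation_id": 102, "constituent_id": 3, "amount": 25.0},  # orphan
]

def orphan_records(children, parents, key):
    """Return child rows whose key has no matching parent row."""
    parent_keys = {p[key] for p in parents}
    return [c for c in children if c[key] not in parent_keys]

orphans = orphan_records(donations, constituents, "constituent_id")
# Donation 102 points at a constituent that doesn't exist, so the two
# data sets lack integrity in the sense discussed above.
print(orphans)
```

The same idea scales up: before trusting a joined report, count the records that fall out of the join, because those orphans are exactly where people lose confidence in the data.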

And again, I'll use a case story as an example of integrity. Integrity matters in the healthcare industry. We've all been thinking about this a lot lately, when you need to be able to bring together data related to diseases from a wide variety of sources. Viewing all of the medical records from a particular hospital is relatively easy by comparison to collecting and analyzing disease-related data from a variety of different hospital systems or other similar institutions. Without being able to easily integrate the various data sets, it is very time consuming to analyze the data, perhaps even to perform predictive analysis that would help to control the spread of a particular disease. So in this case, as in many cases, including the last one around disaster relief, where there's a high degree of criticality associated with making a timely response, the ease with which the different parts of the data fit together is really important. You'll know, probably from your own experience as a medical services consumer, that there's been a lot of work done to standardize the collection of healthcare data within the United States and globally, particularly over the past decade or fifteen years. But we all know there's still a long way to go in terms of data record integrity and portability: being able to take your own records and integrate them with another record set in another location.

So it's a continual challenge and a continual battle. One of the reasons that work is being done is to ensure there is integrity across multiple different systems and multiple different means of collecting data, which eases the ways in which people can use that data to improve public health and mitigate risk. But as you know, with the pandemic there have emerged many challenges to the authentic interpretation of data, based on the ability of different record sets, in different systems in different parts of the country and the world, to produce data sets that integrate well with each other and can do so rapidly enough to support analysis.

Recapping successful data quality

So we're going to do a little bit of recap here, because I just covered a lot of information in a short period of time. Here are the aspects of data quality that we've covered today: completeness, the breadth and depth of the data, which we considered through the lens of gender equality services; validity and accuracy, considered through the lens of a donor retention case story; consistency, considered through the lens of youth services; timeliness, considered through the lens of disaster response; and integrity, considered through the lens of healthcare. There are many different examples I could share for each of these dimensions of data quality, and I'm sure those of you who have been in the trenches on data quality within nonprofit or social good organizations, or in systems administration, which in many ways is also data management, have a lot of your own stories to share about successes and failures in data quality. Hopefully someday, in the not too distant future, we'll be able to meet in person at a conference, have some drinks, and swap stories about all of these different kinds of things.

Talking about data quality in your organization

And so now I'm going to talk about a couple of pragmatic recommendations for introducing or carrying forward data quality conversations in your own organizations. First, there is no such thing as perfect data quality, much to my chagrin, and to the chagrin of many, particularly in the data analysis space. For data to be perfect, it would always have to be precisely representative of the real-world object, and we know that this is extremely difficult, if not impossible. The objects that we work with in the course of performing our work are often complex. And as I said before, particularly with people or environments, there's rarely such a thing as a completely static object being described. So there's an inherent challenge in describing the real-world object and keeping that description accurate, and it's important to remember that. Also, as I said before, data is meaningless absent the context of relationships. Your relationship to that real-world object, and the relative importance of that relationship compared to the other relationships you have, dictates your data quality priorities, and this is what drives you to ask the critical questions: How complete does this data need to be? How accurate does it need to be? Et cetera. We all have a finite amount of resources that can be committed to data quality, and it's difficult to prioritize those resources. So oftentimes organizations will ask, well, how should we prioritize what data we focus on? One of the things I ask them, as a thought-provoking question, is: is it better to have a small amount of high-quality data or a large amount of low-quality data? And I just let them wrestle with that a little bit.

But ultimately the answer is: what are my obligations toward the thing that I'm in a relationship with, and what do those obligations require me to track in the form of data in order to maintain the nature of that relationship? And then the question is what additional data might we want or prefer to capture that satisfies the need for additional exploration, whether that's entrepreneurism and looking for additional opportunities to do more within that relationship, or to benefit the people or things on the receiving end of that relationship. Those are the kinds of questions you need to ask in trying to prioritize which parts of your data set should receive the highest priority from a data quality standpoint: the relative importance of the relationship to the organization, and the obligations within those relationships, even if they're just self-imposed obligations based on your mission. And then, secondarily, what additional data might you want to collect in order to inform potential future ventures or introduce additional program- or mission-oriented benefit. And again, I just want to emphasize that the specific approach to data quality for each organization relies on the clarity and strength of organizational strategy, both at a high level and as it's extended through programs and projects. Ultimately the strategy should tell you which relationships are most critical to success and the value placed on each relationship. Then you can prioritize your resources according to that value.

And in order to do that effectively, this often requires some degree of transformation at an organizational level before improvements to processes, data, or technology can be effectively leveraged. And again, it all comes back to leadership-driven transformation. If your organization doesn't have a strong strategic plan, doesn't say how success against that plan will be measured, and doesn't provide the entire organization with that kind of clear guidance, it becomes very difficult to sustain good data quality efforts over a prolonged period of time.

And finally, organizations of any size and complexity that make data quality a cultural endeavor typically have good data quality relative to their need for it, while organizations that do not make it a cultural endeavor will generally have poor data quality. It's just that simple. So if you're going to do data quality, do it well and do it thoroughly, and make sure to celebrate successes. For all the talk about data quality in the nonprofit space and the broader social sector, having truly high-quality data is relatively rare, and the ability to use that data to inform meaningful action is even rarer. So whenever your organization makes any progress on data quality, even if it doesn't get you all the way to where you think you need to be, take some time to celebrate that success, and help people feel like there are meaningful milestones that can be reached, even ones that fall short of total perfection or of the standards of reasonable success established by your organization's strategic priorities.

I just wanted to provide my contact information, as I did in the Slack channel as well, to let you know that I'm happy to have additional follow-up conversations with anybody after this. If anything occurs to you over the coming days or weeks that you'd like to discuss, an idea to kick around, perhaps an initiative you're trying to get going in your own organization with regard to data quality, or just questions about how to approach particular situations, I'm happy to discuss. There's no obligation; this is just part of engaging within the community. Obviously I work for a consulting firm, and there may be some things we can do for you down the road, but this is primarily just about me being available to you within this context. You see the options available on screen here; the best way to get hold of me is via email.


We've got plenty of time left at the end of this session for Q&A. So if you have any questions or comments, as I said, feel free to put them into the Slack channel. I'm happy to hang out and answer as many as you have. They could be questions about the content of the presentation, or this might be an opportunity to ask specific questions about your own needs. If you don't want me to mention the name of your organization out loud, let me know. I see somebody is typing something in here. If you don't have any questions or don't want to stick around for the Q&A, feel free to drop off as well; that's totally fine.

Stakeholders and the importance of data quality

One of the questions I get most commonly is how to make data quality problems really present to stakeholders who need to be convinced of the challenge in order to create and apply a solution. Storytelling is a great way to do that, as I hope I've been able to exhibit during today's presentation. We've also found that it is very helpful to do some visualization of the data. You can say, "we have 23% of our constituent records without a value in X field," but for some reason that statement alone will not resonate with somebody as well as a simple pie chart showing the exact same statistic. Now, obviously that's a very easy, simple visual to produce, and there are a lot of other ways you can visualize data, but sometimes even really simple things like that are very helpful.
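Computing that kind of "X% of records are missing field Y" statistic is the easy first step before any chart. Here is a small sketch with made-up records and field names; a pie chart of the resulting number could then be drawn with any charting tool (for example, matplotlib's `pie`):

```python
# Hypothetical constituent records; None and "" both count as missing.
records = [
    {"name": "Ana", "email": "ana@example.org"},
    {"name": "Ben", "email": None},
    {"name": "Cal", "email": "cal@example.org"},
    {"name": "Dee", "email": ""},
]

def pct_missing(rows, field):
    """Percentage of rows with no usable value in the given field."""
    missing = sum(1 for r in rows if not r.get(field))
    return 100.0 * missing / len(rows)

print(f"{pct_missing(records, 'email'):.0f}% of records missing email")
# prints "50% of records missing email"
```

The number and the chart carry the same information; the chart just tends to land better with stakeholders, which is the whole point of the visualization advice above.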

Valuing clean data within the data entry team

So we have a question here from Georgina: "I'm experiencing issues with manual data entry from an overworked team. How do I try to help them understand the importance of clean entry?" That's a great question.

Ultimately, I think the value of clean data entry needs to be connected pretty clearly to what having clean data enables your team or your organization to do better. One of the most significant disincentives for the individuals engaged in managing the data at all levels, whether it's entry, making edits, or maintaining data cleanliness over time, is when the organization is not clearly and consistently running reports on that data and using it to inform decisions that lead to better outcomes.

So, you know, if somebody tasked you with the goal of doing data entry, and it was being done in addition to numerous other things you were doing, and you had no insight into how that data was being used and what difference it really made, that would be a problem. Georgina, without knowing more about your specific situation, that's something I would recommend: just make that clear. And if you don't have the ability to do that for whatever reason, then the organization might have some other challenges it needs to address. I'm glad that was helpful; thanks for the question.

Challenges of leadership and data quality

Any other questions? I'm trying to think of some other questions that I commonly receive when presenting on this topic. I think that ultimately the biggest challenges in data quality do often come back to leadership. And that's challenging because, for many organizations, it's most difficult to address the topic at that level. That's why I say that sometimes, in order for data quality improvements to be made, there needs to be a profound cultural change, and that starts at the leadership level. Sometimes in organizations we've actually had to see a leadership change, maybe in a President or ED role, or in a COO or CFO role, or at some other senior executive level, before that change could be realized more broadly throughout the different levels of the organization.

We had one client that we worked with over a prolonged period of time, and we were able to make some minor gains in terms of data quality, but ultimately, whenever there were gaps in the data, the reports would just be taken by the different programmatic teams, and they would fill in the missing gaps with anecdata based on their experience. There was no telling how accurate the data actually was or wasn't. And this was information that was going into board reports, going to funders, and being used to measure success internally. Ultimately, what happened is that when they had turnover at the COO position, they were able to start righting that ship, and this may address your question a little bit as well, Georgina. One of the things that COO did, which is actually very infrequently done at nonprofit organizations, was to make effective data management a part of the job description for employees, something that would comprise part of their employee performance evaluation, which was done, I think, annually. That was the first time the timeliness and quality of data input and management over time was made part of the actual job description and performance management for those individuals. It is difficult to implement something like that with clarity, and you want to do it with clarity if individual and team performance are going to be evaluated against it. But it can be extremely beneficial culturally for the organization to do something like that; it is also a way to very practically say to the entire organization that we are serious, that this is of high importance. It made a profound change in that particular organization, and I imagine it would in other organizations that were able to pull it off effectively.

Qualities to consider for a data manager

Another question that I often get is what qualities to look for when hiring a data manager or a system administrator who is charged with some data quality management. I have a whole separate presentation about that called A Good Data Manager is Hard to Find. You can actually find that, I think, on the Social Good website from a couple of years back, but it's also available on my firm's website. We have a whole section titled Learning in the navigation that has all the recordings of our past webinars, of which that is one. They are available on YouTube in video format, and many of them, particularly from the last couple of years, are available in mp3 format. Because they don't rely extensively on visuals, you can listen to them, maybe not on a commute these days, but when you're out riding your bike or taking a walk. That is a good way to learn. We also have some templates and white papers there that are helpful, and our blog, in a separate section, has a number of articles about data quality, system selections and implementations, and what good leadership for technology projects inside of organizations looks like.

I received another question: "Would love to get some pointers on how best to handle data cleanup and find errors in data entry." Well, that's a broad question. As I said, one of the ways to find errors in data entry is to visualize the data; it helps outliers stand out a little more. It can also be helpful to establish, as I said earlier under validity, clear domains of values against which data should be compared to identify validity errors, and to spot check data for accuracy, say by taking some constituent records at random and comparing them to information stored elsewhere about that individual, or maybe interviewing the individual to make sure the data you have is still accurate. So there are a variety of different ways to assess whether the data you have is of good quality. In terms of handling data cleanup, there are a variety of different tools for that. For example, you could do address validation to see if your addresses are current for locations or constituent records, or use deduping tools to eliminate duplicates or identify probable duplicates. Once you identify specific data errors based on spot checking, you could do a more in-depth comparison for accuracy and make that into a structured internal initiative. There are a variety of ways to organize data cleanup. I think the most common point at which data cleanup is done within organizations is when an organization is migrating, or preparing to migrate, from one system to another and doesn't want to carry the flotsam and jetsam forward from the waters of its data lake, so to speak, into the new system. That's a point at which cleanup becomes a high priority.
That's true when moving between one CRM and another, sometimes between one program management system and another, and sometimes even just when moving the contents of a network file share into SharePoint or Microsoft Teams for the first time. So a new system selection and implementation effort can be a catalyst that provides some impetus and structure for a data cleanup project. Another avenue: if you have a vendor-provided and supported system for a very specific functional area of an organization, like, say, human services case management, the vendor that provides that tool may have some insights to offer in terms of how best to measure data quality and perform data cleanup tasks for that particular system or the functional area of expertise it supports. It's less easy to get that kind of insight when your data is stored in a set of Excel tables you custom developed, or in an Access or SQL database. So there are a lot of different things to consider when thinking about both identifying data quality issues and how to handle the data cleanup. I will say that accuracy issues are generally easiest to catch at the point of entry, so if you have a QA process where every X number of records is checked by another person, or, as I said, if you increase the system-embedded validity rules around data entry, this can all be helpful.
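As a small illustration of the point-of-entry validity checks just mentioned, here is a sketch of comparing new entries against a clear domain of values. Everything here is hypothetical: the allowed states, the field names, and the rule that a ZIP must be a five-digit code are stand-ins for whatever domains your own organization defines:

```python
# Hypothetical domain of valid values for a "state" field,
# e.g. an organization whose service area covers three states.
ALLOWED_STATES = {"NJ", "NY", "PA"}

def validate_entry(record):
    """Return a list of validity errors; an empty list means the entry passes."""
    errors = []
    if record.get("state") not in ALLOWED_STATES:
        errors.append(f"state {record.get('state')!r} outside allowed domain")
    zip_code = str(record.get("zip") or "")
    if not (zip_code.isdigit() and len(zip_code) == 5):
        errors.append(f"zip {zip_code!r} is not a 5-digit code")
    return errors

print(validate_entry({"state": "NJ", "zip": "07302"}))  # [] -- passes
print(validate_entry({"state": "XX", "zip": "1234"}))   # two errors
```

Running checks like this at the moment of entry, or in a QA pass over every Nth record, catches validity errors far more cheaply than finding them later during a migration cleanup.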

Well, I think we're at time, or close to it, and that is all the questions I've received. So thank you again so much for engaging in this session today, and thanks for your questions. I hope this was valuable for you, and if you have any follow-ups you'd like to send my way, my email address is in the channel chat. Whether that's in a couple of days, next week, or a couple of months from now, if something comes across your desk that you'd like some feedback on, feel free to reach out to me. Have a great day, everyone. Thanks again. Bye.