Visualizing metadata

VISUALIZING METADATA IN MWATER
Metadata is information collected alongside your survey responses, in addition to the actual answers provided. It serves as a valuable tool for analyzing participation data, improving response rates during live projects, and grouping your data for visualization. Metadata is automatically generated for each survey, capturing details such as deployment, enumerator information, survey status, submission dates, last modification dates, and more. In the subsequent text, we will explore these types of metadata in detail.

This information can be used to identify quality control issues, detect trends for data analysis, and allow users to filter specific portions of their data. For example, you could use the 'Submitted on' metadata field to filter table or map visualizations so that they display only the responses submitted within a specific time period.
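For readers who work with exported responses outside mWater, the date filter described above can be sketched in a few lines of Python. This is a minimal illustration, not mWater's own implementation: the sample rows and the field names `response_code` and `submitted_on` are assumptions made for the example and may not match the column names in an actual export.

```python
from datetime import datetime

# Hypothetical exported responses; field names are illustrative only.
responses = [
    {"response_code": "A-001", "submitted_on": "2024-03-02T09:15:00"},
    {"response_code": "A-002", "submitted_on": "2024-03-18T14:40:00"},
    {"response_code": "A-003", "submitted_on": "2024-04-05T08:05:00"},
]


def submitted_between(rows, start, end):
    """Return the rows whose submission time falls in [start, end)."""
    return [
        row for row in rows
        if start <= datetime.fromisoformat(row["submitted_on"]) < end
    ]


# Keep only the responses submitted in March 2024.
march = submitted_between(responses, datetime(2024, 3, 1), datetime(2024, 4, 1))
print([row["response_code"] for row in march])  # ['A-001', 'A-002']
```

The end of the range is exclusive, so a response submitted at midnight on 1 April would fall into the April range rather than being counted twice.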

Example on how to use 'Submitted on' metadata information
Survey Metadata structure in mWater. This varies depending on the survey design
Metadata provides a structured summary of basic information about data, making it easier to find, use, and reuse specific data instances. Often described as "data about data," metadata serves as a reference framework that helps organize, sort, and identify attributes of the information it describes.
It is automatically generated whenever a survey, file, or information asset is created, modified, or deleted. For administrators and managers, metadata offers a comprehensive audit trail, enabling them to track every action performed on a particular survey. This ensures efficient data management, quality control, and enhanced decision-making processes.
Metadata Components
Deployment
This is one instance of a form being sent to users to fill out. It refers to the specific project or activity under which the data was collected. This provides context for the data, linking it to a specific survey or intervention.

One survey can have multiple deployments. This is useful for filtering and organizing responses by project.

The GIF below illustrates two ways to use deployment metadata to organize your data.

How to set up filters and quickfilters using metadata
Filters and quickfilters are powerful tools for visualizing data based on deployment metadata. They allow you to display subsets of any data set in dashboards, consoles, maps, and data grids, tailoring the visualization to your specific needs. Filters work by defining statements that determine which records are displayed: if a statement evaluates to True for a record, that record is shown.

QuickFilters are particularly user-friendly, as they are displayed directly on dashboards and accessible to all users. These filters apply to the entire dashboard, enabling broad or specific views of the data as needed. For instance, deployment metadata filters can be used to visualize submissions linked to specific projects or locations.

The same filtering logic can be applied to other metadata types, such as enumerator names or submission dates. This allows you to isolate key details like who collected the data or when submissions occurred, enhancing your ability to analyze patterns or trends.
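The filter logic described above (a record is shown when its filter statements evaluate to True) can be sketched as follows. This is only a conceptual model written for illustration, not how mWater evaluates filters internally; the record fields and the helper `matches_all` are hypothetical.

```python
def matches_all(record, statements):
    """A record is shown only if every filter statement evaluates to True."""
    return all(statement(record) for statement in statements)


# Hypothetical records carrying deployment, enumerator, and status metadata.
records = [
    {"deployment": "North Region", "enumerator": "amina", "status": "final"},
    {"deployment": "South Region", "enumerator": "amina", "status": "final"},
    {"deployment": "North Region", "enumerator": "joseph", "status": "pending"},
]

# Each filter is a statement about one metadata field.
filters = [
    lambda r: r["deployment"] == "North Region",
    lambda r: r["enumerator"] == "amina",
]

shown = [r for r in records if matches_all(r, filters)]
print(shown)
```

With no filters defined, every record passes, which mirrors an unfiltered dashboard. Adding a statement narrows the view, in the same way that choosing a value in a quickfilter does.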

The graph on the side illustrates how deployment metadata can be used to visualize the number of submissions across different deployments (in this case, named after regions), offering a clear picture of data collection activity in the different regions.
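A chart like this is a count of responses grouped by deployment. Assuming the responses have been exported and each row carries a `deployment` field (a name chosen for this example, not necessarily the export's column name), the same tally can be sketched with Python's standard library:

```python
from collections import Counter

# Hypothetical exported rows; only the deployment field matters here.
submissions = [
    {"deployment": "North Region"},
    {"deployment": "South Region"},
    {"deployment": "North Region"},
    {"deployment": "East Region"},
    {"deployment": "South Region"},
    {"deployment": "North Region"},
]

# Count how many submissions belong to each deployment.
counts = Counter(row["deployment"] for row in submissions)

# List deployments from most to least active.
for deployment, total in counts.most_common():
    print(f"{deployment}: {total}")
```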

Enumerator
The person or team responsible for collecting data, often referred to as the enumerator, is a crucial aspect of metadata. Metadata related to enumerators may include their username, email address, or other identifying details, as shown in the attached image. This information is instrumental in tracking who gathered the data, ensuring accountability, and providing insights into enumerator performance.

With enumerator metadata, you can run queries to analyze data based on specific enumerators, track their performance, monitor work rates, and assess the quality of their submissions. Filters and quickfilters can be applied to view data segmented by enumerator usernames, allowing for more targeted data management. For instance, you can perform data cleaning on an enumerator-by-enumerator basis to ensure the integrity of the dataset.

One key advantage of tracking enumerator metadata is the ability to identify patterns in data collection. For example, you can easily spot enumerators who consistently submit incorrect data and focus on improving their accuracy. Similarly, you can streamline the approval process by prioritizing data from enumerators who consistently meet high-quality standards.

Additionally, you can incorporate calculations within the survey to track the number of completed responses or flag incomplete surveys on an enumerator basis.
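The enumerator-by-enumerator checks described above amount to a pivot of responses by enumerator and status. The sketch below shows the idea on exported data; the field names, the status values, and the helper functions are assumptions made for this example rather than mWater identifiers.

```python
from collections import defaultdict

# Hypothetical exported responses with enumerator and status metadata.
responses = [
    {"enumerator": "amina", "status": "final"},
    {"enumerator": "amina", "status": "final"},
    {"enumerator": "amina", "status": "rejected"},
    {"enumerator": "joseph", "status": "final"},
    {"enumerator": "joseph", "status": "pending"},
    {"enumerator": "lin", "status": "rejected"},
    {"enumerator": "lin", "status": "rejected"},
]


def pivot_by_enumerator(rows):
    """Count responses per enumerator, broken down by status."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row["enumerator"]][row["status"]] += 1
    return {name: dict(statuses) for name, statuses in table.items()}


def rejection_share(status_counts):
    """Fraction of an enumerator's responses that were rejected."""
    total = sum(status_counts.values())
    return status_counts.get("rejected", 0) / total if total else 0.0


pivot = pivot_by_enumerator(responses)
for name, status_counts in pivot.items():
    print(name, status_counts, f"{rejection_share(status_counts):.0%} rejected")
```

An enumerator with a high rejected share is a candidate for follow-up or retraining, while one with a consistently low share could be prioritized during approval.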


Categories of enumerator metadata
Pivot table used to perform data quality checks on a per-enumerator basis
Status
Status in metadata indicates the current state of the data, such as whether it is still in progress, submitted, finalized, or pending. Monitoring the status of data is essential for identifying incomplete or pending responses. This helps ensure that you are visualizing only the data that is relevant and finalized, which is crucial for accurate analysis.

It is highly recommended to select finalized data for visualization, as this ensures you are working with clean, verified, and complete data. When choosing your data source, make sure to toggle the option to display only finalized responses. However, for data collection and cleaning processes, it’s acceptable to leave this option unchecked for surveys that are still pending, as this allows for ongoing data entry and corrections.

Another alternative is to use filters to select the appropriate status directly from the metadata section, much like how we applied filters for deployment or submission dates in earlier examples. This approach allows for greater flexibility and precision when working with different stages of data.

An image illustrating how to toggle the data source options on and off
Response Code
A unique identifier is assigned to each survey response, ensuring that every response can be traced back individually. This is critical for data audits, error tracking, and ensuring accountability in the data collection process. The unique identifier becomes especially important during the data export and import stages, particularly for surveys that lack a site question as a unique identifier.

The response code serves as a reference point, allowing data to be related back to specific surveys and tracked within the mWater database. If a user reports missing data, the response code can be used to locate the relevant information and resolve any discrepancies.
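When working with an export, the same idea applies: the response code can act as the key for looking up a single response. The following sketch assumes each exported row has a `response_code` field (a hypothetical name used for illustration) and builds a lookup table that also checks that codes are unique.

```python
def index_by_response_code(rows):
    """Build a lookup table keyed by response code, rejecting duplicates."""
    index = {}
    for row in rows:
        code = row["response_code"]
        if code in index:
            raise ValueError(f"duplicate response code: {code}")
        index[code] = row
    return index


# Hypothetical exported rows.
exported = [
    {"response_code": "A-001", "enumerator": "amina", "status": "final"},
    {"response_code": "A-002", "enumerator": "joseph", "status": "pending"},
]

by_code = index_by_response_code(exported)

# Locate the response a user is asking about; None means it is not in the export.
print(by_code.get("A-002"))
print(by_code.get("A-999"))
```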

It's important to note that mWater does not lose user data. The system incorporates multiple backup levels to ensure that all information is securely archived. Once the data is synchronized, it can always be retrieved, and the response code remains a powerful tool for tracking responses and maintaining data integrity.

Drafted On, Submitted On, Last Modified On, Last Modified By

The metadata fields Drafted On, Submitted On, Last Modified On, and Last Modified By provide information for tracking and managing survey responses. Drafted On indicates when data collection began, and Submitted On shows when the response was finalized and submitted, helping assess the timeline of data collection. You can track the time a user takes to complete the survey by using these two fields in Calculations. The accompanying image shows how the minutes spent completing each survey response were calculated.

Illustration on how to use expressions to calculate time intervals using metadata
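The calculation itself is the difference between the two timestamps, expressed in minutes. Outside mWater's expression builder, the same arithmetic can be sketched in Python. The function name and the ISO-formatted timestamp strings are assumptions made for this example.

```python
from datetime import datetime


def minutes_to_complete(drafted_on, submitted_on):
    """Minutes elapsed between starting a response and submitting it."""
    started = datetime.fromisoformat(drafted_on)
    finished = datetime.fromisoformat(submitted_on)
    return (finished - started).total_seconds() / 60


# A response drafted at 09:15:00 and submitted at 09:42:30 on the same day.
print(minutes_to_complete("2024-03-02T09:15:00", "2024-03-02T09:42:30"))  # 27.5
```

Very short durations can point to rushed or fabricated responses, and very long ones often mean the draft was left open and finished later, so both ends of the range are worth reviewing.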
Last Modified On records the most recent date and time the response was edited, ensuring transparency in tracking any changes made post-submission. Last Modified By identifies the user who last edited the response, providing accountability and insight into who made changes to the data after its initial submission. Collectively, these fields help monitor the progress, timeliness, and integrity of the data collection process.

Approval Tracking:

To effectively monitor the approval process and ensure transparency, the following metadata fields provide valuable insights into each stage of the approval process:

Approval Level: Tracks the stage of review or approval the response has gone through. It facilitates multi-level validation of data, ensuring that each response undergoes the necessary accuracy and quality checks before being finalized. A survey can have more than one level of approval.

Approval Level 1 By: Identifies the individual or team responsible for approving the response at the first approval level. It highlights the workflow hierarchy in data validation, allowing you to identify who is responsible for each step of the process.

Approval Level 1 On: Records the date and time when the response was approved at the first level. It provides a documented timeline of the approval process, enabling you to track and report on approval timelines for each response.

Best Practices and Recommendations:

Stage 1 Approvers: It’s important to ensure that users designated as stage 1 approvers are not admins of the entire survey.

Minimize Admins: We recommend keeping the number of admins for any survey to a minimum to avoid potential issues with data access and security.

Using User Activities to Track Approval:

The User Activities data source in the database tracks actions such as entity creation, updates, and deletions, as well as response changes (created, updated, deleted). It stores the "before" and "after" status of the response and can be accessed in the schema as the User Activities table, under the Advanced section of the data sources.

By filtering this data, you can determine:

A recent update allows you to pull approver data from the survey metadata itself, which includes:

Rejection

The Rejection Message provides feedback when a response is rejected during validation, helping enumerators understand why data was rejected and what corrections are needed. The Number of Rejections tracks how often a response has been rejected; frequent rejections can highlight areas needing improvement in data collection or review procedures. The Number of Edits records the total number of times a response was edited outside of normal submissions. Frequent edits may signal quality concerns that need addressing, either in data input or review. Additionally, if a finalized response needs to be moved back to a previous status (such as Pending or Draft), it can be rejected so that the user can resubmit it.
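One way to act on these counters is to flag responses whose rejection or edit counts exceed a threshold. The sketch below works on exported rows; the field names and the threshold values are arbitrary assumptions chosen for illustration and should be adjusted to your own review process.

```python
def needs_review(row, max_rejections=2, max_edits=3):
    """Flag a response whose rejection or edit count exceeds a threshold.

    The default thresholds are illustrative, not recommended values.
    """
    return (
        row["number_of_rejections"] > max_rejections
        or row["number_of_edits"] > max_edits
    )


# Hypothetical exported rows with rejection and edit counters.
rows = [
    {"response_code": "A-001", "number_of_rejections": 0, "number_of_edits": 1},
    {"response_code": "A-002", "number_of_rejections": 3, "number_of_edits": 0},
    {"response_code": "A-003", "number_of_rejections": 1, "number_of_edits": 5},
]

flagged = [row["response_code"] for row in rows if needs_review(row)]
print(flagged)  # ['A-002', 'A-003']
```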

IP Address

Records the IP address from which the response was submitted. Because an IP address can be mapped to an approximate location, this field provides additional context about the geographic origin of the data.