Practical Guide to Data Formatting and Comparison

Last updated: February 2025 · 11 min read

What you will learn

  • Why clean, formatted data matters in QA workflows
  • How to use JSON Beautifier and XML Beautifier effectively
  • How to compare JSON payloads with the JSON Diff tool
  • When to pretty-print vs. when to minify
  • Strategies for working with large payloads
  • How to integrate formatted data into documentation and tickets
  • How to build a data quality workflow for your team

Why Clean Data Matters in QA

In ad tech QA, you spend a significant portion of your time reading data: API responses from ad servers, bid request and response payloads, VAST XML tags, configuration files, event logs, and debugging output. This data almost never arrives in a readable format. Ad servers return minified JSON to reduce bandwidth. XML payloads come as single lines with no indentation. Log files concatenate multiple JSON objects without spacing. The raw data is technically correct, but it is unreadable to a human scanning for specific values.

Unformatted data is not just inconvenient; it is error-prone. When a QA engineer scans a minified JSON payload for a specific field, the likelihood of missing a value, misreading a nested structure, or overlooking a subtle difference between two similar payloads increases sharply. Code readability is generally associated with better defect detection, and the same principle applies to data readability. Formatted, indented data makes structural relationships visible at a glance, allowing you to spot anomalies quickly and accurately.

Clean data also improves communication. When you file a bug report, paste a payload into a Slack message, or include data in documentation, formatting determines whether your audience can understand what you are showing them. A minified JSON blob attached to a Jira ticket is effectively useless to the developer who needs to reproduce the issue. A properly formatted, syntax-highlighted payload communicates the problem instantly.

Formatting JSON with JSON Beautifier

The JSON Beautifier takes any valid JSON input (minified, partially formatted, or inconsistently indented) and produces cleanly formatted output with consistent indentation, proper line breaks, and aligned structure. The tool parses the JSON into an internal representation and then serializes it back with formatting rules applied, ensuring that the output is both valid JSON and easy to read.

To use it, paste your raw JSON into the input panel and click format. The tool instantly produces indented output where each key-value pair occupies its own line, nested objects and arrays are indented one level deeper than their parent, and closing brackets align with their opening counterparts. This visual structure makes it easy to trace the hierarchy of a complex payload: you can see at a glance which fields belong to which object and where arrays begin and end.
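
The parse-then-serialize round trip described above can be reproduced outside the tool. The following is a minimal sketch using Python's standard json module; the function name beautify_json and the sample payload are illustrative assumptions, not the Beautifier's actual implementation.

```python
import json


def beautify_json(raw: str, indent: int = 2) -> str:
    """Parse JSON text and re-serialize it with consistent indentation."""
    # json.loads raises json.JSONDecodeError if the input is not valid JSON.
    data = json.loads(raw)
    # ensure_ascii=False keeps non-ASCII characters readable instead of \u escapes.
    return json.dumps(data, indent=indent, ensure_ascii=False)


# Hypothetical minified payload, loosely shaped like a bid request.
minified = '{"id":"req-1","imp":[{"id":"1","bidfloor":0.5}]}'
print(beautify_json(minified))
```

Because the output is produced by serializing parsed data, it is valid JSON by construction, and the original key order is preserved.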

If the input is not valid JSON, the beautifier reports the parsing error with a description and the approximate position of the problem. Common causes include trailing commas after the last element in an array or object (valid in JavaScript but not in JSON), single-quoted strings (JSON requires double quotes), unquoted keys, and missing commas between elements. The error message helps you locate and fix the issue before re-formatting.
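
To see how a parser reports these problems, here is a small sketch built on Python's json.JSONDecodeError, which exposes a message plus line and column numbers. The helper name check_json and the sample inputs are assumptions for illustration; the Beautifier's own error wording may differ.

```python
import json


def check_json(raw: str):
    """Return None if raw is valid JSON, otherwise a short error description."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        # msg, lineno, and colno are attributes of JSONDecodeError.
        return f"{err.msg} (line {err.lineno}, column {err.colno})"
    return None


# Each sample reproduces one of the common causes listed above.
samples = {
    "trailing comma": '{"a": 1,}',
    "single quotes": "{'a': 1}",
    "unquoted key": '{a: 1}',
    "missing comma": '{"a": 1 "b": 2}',
}
for label, text in samples.items():
    print(f"{label}: {check_json(text)}")
```

The exact message text varies between Python versions, so treat the position as the reliable part of the report.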

Practical JSON Formatting Tips

  • Format before reading. Always format a JSON payload before trying to interpret it. Even experienced engineers miss nested structures in minified JSON. The 5 seconds spent formatting saves minutes of squinting.
  • Use formatting to validate. If the beautifier throws a parse error, the data is invalid. This is a useful first-pass validation: before checking business logic, confirm that the data is structurally sound.
  • Compare formatted output. When comparing two payloads, format both first with the same settings. Inconsistent formatting creates false differences that obscure actual changes.

Formatting XML with XML Beautifier

The XML Beautifier does for XML what JSON Beautifier does for JSON. It takes any well-formed XML document (minified, single-line, or inconsistently formatted) and produces properly indented output where each element, attribute, and text node is clearly delineated. In ad tech, the primary XML format you encounter is VAST (Video Ad Serving Template), but XML also appears in VMAP playlists and various publisher configuration formats.

XML formatting is particularly valuable for VAST inspection. A typical VAST response contains nested elements for ad definitions, creative assets, tracking events, and companion ads. In minified form, these elements blur together into an unreadable stream of angle brackets. Formatted, the hierarchical structure becomes visible: you can quickly identify the <Ad> container, find the <MediaFile> URLs, trace the <TrackingEvents> list, and verify that all required elements are present.

Like the JSON tool, the XML Beautifier validates the input as part of the formatting process. If the XML is malformed (missing closing tags, mismatched element names, invalid characters, or incorrect namespace declarations), the tool reports the error. This is especially helpful for VAST debugging, where a single malformed element can prevent the entire ad from loading in a video player.
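
The same format-and-check step can be sketched with Python's standard xml.dom.minidom. The function name beautify_xml and the tiny VAST-like fragment are illustrative assumptions; the check covers well-formedness only and does not validate against the VAST schema.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def beautify_xml(raw: str, indent: str = "  ") -> str:
    """Parse XML and return an indented copy; raise ValueError if malformed."""
    try:
        doc = minidom.parseString(raw)
    except ExpatError as err:
        # ExpatError messages include the line and column of the problem.
        raise ValueError(f"Malformed XML: {err}") from err
    return doc.toprettyxml(indent=indent)


# Hypothetical single-line fragment, not a complete VAST document.
single_line = '<VAST version="4.0"><Ad id="1"><InLine/></Ad></VAST>'
print(beautify_xml(single_line))

try:
    beautify_xml("<VAST><Ad></VAST>")  # <Ad> is never closed
except ValueError as err:
    print(err)
```

One caveat: toprettyxml works best on single-line input. If the document is already partly indented, the existing whitespace is kept as text nodes and the output may contain extra blank lines.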

Comparing JSON with JSON Diff

The JSON Diff tool compares two JSON documents and highlights exactly what changed between them. It performs a structural comparison, not a text comparison, which means it identifies added keys, removed keys, and changed values regardless of formatting differences, key ordering, or whitespace. This makes it far more useful than a generic text diff tool for JSON data.

The typical workflow is to paste the "before" JSON in the left panel and the "after" JSON in the right panel, then click Compare. The tool displays the differences inline, with added content highlighted in green, removed content in red, and changed values shown side by side. Unchanged sections are collapsed or dimmed so that the differences stand out visually.

In ad tech QA, JSON Diff is essential for several workflows. When a bid request configuration changes between versions, you can diff the two payloads to see exactly which fields were modified. Was a new targeting parameter added? Was the floor price changed? Did the site domain shift? When an ad server returns an unexpected response, you can diff it against a known-good response to isolate the deviation. When testing API changes in staging vs. production, you can diff the responses to verify that the staging changes produce exactly the expected differences and nothing more.

Structural vs. Text Comparison

A generic text diff tool (like Unix diff) compares files line by line. This means that reordering keys in a JSON object, changing indentation, or reformatting arrays will produce dozens of false-positive differences even though the data is semantically identical. JSON Diff avoids this by parsing both inputs into their structural representation before comparing. Two JSON objects with the same keys and values in different orders are reported as identical. This structural awareness is what makes JSON Diff reliable for data comparison, while text diff tools are better suited for source code and configuration files where formatting is part of the specification.
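
A structural comparison of this kind can be sketched as a recursive walk over the parsed data. The function diff_json, its tuple format, and the sample payloads below are assumptions made for illustration; this is not the JSON Diff tool's algorithm, and it compares arrays position by position, which may differ from how the tool treats reordered arrays.

```python
import json


def diff_json(before, after, path="$"):
    """Return a list of (kind, path, before_value, after_value) tuples."""
    changes = []
    if isinstance(before, dict) and isinstance(after, dict):
        # Key order is ignored: iterate over the union of keys.
        for key in sorted(before.keys() | after.keys()):
            child = f"{path}.{key}"
            if key not in after:
                changes.append(("removed", child, before[key], None))
            elif key not in before:
                changes.append(("added", child, None, after[key]))
            else:
                changes.extend(diff_json(before[key], after[key], child))
    elif isinstance(before, list) and isinstance(after, list):
        # Arrays are ordered in JSON, so compare element by element.
        for i in range(max(len(before), len(after))):
            child = f"{path}[{i}]"
            if i >= len(after):
                changes.append(("removed", child, before[i], None))
            elif i >= len(before):
                changes.append(("added", child, None, after[i]))
            else:
                changes.extend(diff_json(before[i], after[i], child))
    elif isinstance(before, bool) != isinstance(after, bool) or before != after:
        # The bool check stops Python from treating true and 1 as equal.
        changes.append(("changed", path, before, after))
    return changes


old = json.loads('{"site":{"domain":"example.com"},"bidfloor":0.5}')
new = json.loads('{"bidfloor":0.75,"site":{"domain":"example.com"},"tmax":120}')
for change in diff_json(old, new):
    print(change)
```

In this example the keys appear in a different order in the two inputs, yet only the changed bidfloor and the added tmax are reported.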

Choosing Between Pretty-Print and Minify

Pretty-printing (formatting with indentation and line breaks) and minifying (removing all unnecessary whitespace) serve different purposes, and choosing the right one depends on context.

Pretty-print when: you are reading data for understanding or debugging, including data in documentation or tickets, sharing payloads in messages or emails for human review, or storing configuration files that will be version-controlled (formatted files produce cleaner diffs in Git). Readability is the priority in these contexts, and the additional bytes from whitespace are negligible.

Minify when: you are constructing data for transmission over a network (API requests, ad server responses), storing data in size-constrained environments (cookies, URL parameters, localStorage), or generating test fixtures where file size matters. Minification removes whitespace and line breaks without changing the data content, reducing payload size by 10-30 percent depending on the structure's depth and key/value lengths.
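
As a rough illustration of minification, the sketch below re-serializes parsed JSON with no optional whitespace, using the separators argument of Python's json.dumps. The function name minify_json and the sample document are assumptions; the size difference you see depends entirely on the payload.

```python
import json


def minify_json(raw: str) -> str:
    """Re-serialize JSON with no whitespace between tokens."""
    # separators=(",", ":") drops the spaces json.dumps inserts by default.
    return json.dumps(json.loads(raw), separators=(",", ":"), ensure_ascii=False)


# Hypothetical pretty-printed input.
pretty = """{
  "id": "req-1",
  "imp": [
    {"id": "1", "bidfloor": 0.5}
  ]
}"""
compact = minify_json(pretty)
print(compact)
print(len(pretty), "characters before,", len(compact), "after")
```

Parsing the minified string yields the same data as parsing the original, which is an easy way to confirm that only whitespace was removed.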

A common mistake is to minify data before comparing it. Because minification removes formatting, a text diff between two minified JSON objects produces a single, unreadable difference spanning the entire file. Always format both inputs consistently before comparing. The JSON Diff tool handles this automatically by parsing structurally, but if you are using external diff tools, format first.
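
When an external text diff is the only option, one way to format consistently is to normalize both inputs to the same indentation and key order first. The sketch below assumes Python's json and difflib modules; the helper names canonical and text_diff are hypothetical.

```python
import difflib
import json


def canonical(raw: str) -> str:
    """Format JSON with fixed indentation and sorted keys."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True)


def text_diff(before_raw: str, after_raw: str) -> list:
    """Line-based unified diff of two JSON texts after normalizing both."""
    before_lines = canonical(before_raw).splitlines()
    after_lines = canonical(after_raw).splitlines()
    return list(
        difflib.unified_diff(
            before_lines, after_lines, fromfile="before", tofile="after", lineterm=""
        )
    )


# Two minified inputs with different key order and one changed value.
for line in text_diff('{"bidfloor":0.5,"tmax":120}', '{"tmax":120,"bidfloor":0.75}'):
    print(line)
```

Sorting keys removes ordering noise, so the line diff shows only the changed value instead of one difference spanning the whole payload.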

Working with Large Payloads

Ad tech payloads can be substantial. A full OpenRTB bid request with dozens of impression objects, detailed device data, user segments, and site metadata can easily reach tens of thousands of characters. VAST wrappers that chain through multiple ad servers accumulate tracking URLs, companion ads, and extensions that produce XML documents hundreds of kilobytes in size. Working with these payloads requires deliberate strategies to stay efficient.

Format first, then search. Once you have formatted the payload, use your browser's search function (Ctrl+F or Cmd+F) to jump directly to the field you need rather than scrolling through the entire document. Searching for a key name in formatted JSON is fast because each key is on its own line.

Extract relevant sections. If you only need to examine a specific part of a large payload (for example, the imp array in a bid request or the <Creatives> section in a VAST tag), copy just that section, format it independently, and work with the smaller extract. This reduces visual noise and makes it easier to focus on the fields that matter.
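
Extraction can also be done programmatically once the payload is parsed. The following sketch walks a list of keys and indexes into the parsed data; the helper name extract and the sample bid request are assumptions for illustration.

```python
import json


def extract(payload, path):
    """Follow a sequence of keys and list indexes and return that section."""
    node = payload
    for step in path:
        # A str step indexes into an object, an int step into an array.
        node = node[step]
    return node


# Hypothetical, heavily trimmed bid request.
bid_request = json.loads(
    '{"id":"req-1","imp":[{"id":"1","banner":{"w":300,"h":250},"bidfloor":0.5}],'
    '"site":{"domain":"example.com"}}'
)
section = extract(bid_request, ["imp", 0, "banner"])
print(json.dumps(section, indent=2))
```

A missing key or index raises KeyError or IndexError, which is itself a quick signal that the payload lacks the section you expected.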

Use diff for large changes. When comparing two versions of a large payload, do not try to visually spot differences by reading both. Paste them into JSON Diff and let the tool identify changes. Even a single changed value in a 10,000-character payload will be highlighted instantly, whereas a human scan might take minutes and still miss it.

Integrating Formatted Data into Documentation and Tickets

Formatted data dramatically improves the quality of bug reports, technical documentation, and investigation tickets. When filing a bug report, include the formatted payload that demonstrates the issue, with the problematic field highlighted or annotated. This allows the developer to reproduce the issue immediately without asking for clarification about which field is wrong or what the expected value should be.

For documentation, formatted examples serve as reference specifications. A formatted JSON example of a valid bid request, annotated with descriptions of each field, is more valuable than a paragraph of text describing the same structure. Formatted XML examples of VAST tags show the exact element hierarchy that a video player expects.

When including formatted data in tickets, wrap it in a code block or use a collapsible section if the platform supports it. Most project management tools (Jira, Linear, GitHub Issues) support markdown code fences with language tags (json, xml) that add syntax highlighting. This makes the data scannable even within the context of a busy ticket thread.
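
If payloads are pasted into tickets often, the wrapping step can be scripted. This is a minimal sketch that formats a JSON string and surrounds it with a markdown code fence; the function name as_code_block is hypothetical, and the fence is built from single backticks only so the example itself can be shown inside a code block.

```python
import json

# Three backticks, assembled at runtime.
FENCE = "`" * 3


def as_code_block(raw: str, lang: str = "json") -> str:
    """Format JSON text and wrap it in a fenced block with a language tag."""
    pretty = json.dumps(json.loads(raw), indent=2)
    return f"{FENCE}{lang}\n{pretty}\n{FENCE}"


print(as_code_block('{"id":"req-1","bidfloor":0.5}'))
```

The printed text can be pasted directly into any ticket field that renders markdown code fences.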

Building a Data Quality Workflow for Teams

Individual formatting is useful, but the real value comes from establishing consistent data quality practices across your QA team. A data quality workflow ensures that every team member formats, compares, and documents data in the same way, making collaboration smoother and knowledge transfer easier.

  1. Standardize tools. Agree on a common set of formatting and comparison tools so that everyone produces consistent output. The JSON Beautifier, XML Beautifier, and JSON Diff tools in this toolkit provide a shared baseline. When everyone uses the same tools, formatted outputs are directly comparable.
  2. Format before sharing. Establish a team norm that any data shared in Slack, email, or tickets must be formatted. Unformatted data should be treated like an unfinished message: not ready for consumption. This simple rule eliminates a significant source of miscommunication.
  3. Diff before releasing. Before deploying configuration changes or API updates, diff the old and new payloads and include the diff output in the release notes or pull request. This gives reviewers a clear picture of what changed and provides an audit trail for future reference.
  4. Archive reference payloads. Maintain a shared repository or wiki page with formatted examples of key payloads: a valid bid request, a correct VAST response, a properly configured consent string. When new team members join or when a production issue arises, these reference payloads provide a known-good baseline for comparison.
  5. Validate proactively. Use the formatting tools as a first-pass validation step in your QA workflow. Before testing business logic, confirm that the data is valid JSON or well-formed XML. Catching structural errors early prevents wasted time debugging logic issues that are actually caused by malformed data.

Tips for Maintaining Readable Data

Beyond using the formatting tools, several practices help keep data readable throughout its lifecycle:

  • Use meaningful key names. When designing APIs or configuration formats, choose descriptive key names that make the data self-documenting. A key called adSlotWidth is immediately understandable; a key called w requires context.
  • Keep nesting shallow. Deeply nested structures are harder to read and navigate, even when formatted. If a JSON object is nested more than 4–5 levels deep, consider whether the schema can be flattened or broken into separate documents.
  • Comment configuration files. For formats that support comments (YAML, many XML variants, JSON5 in development contexts), add comments explaining non-obvious values. For standard JSON, which does not support comments, maintain a companion documentation file.
  • Version your schemas. Include a version field in your data formats so that readers can identify which version of the schema a payload conforms to. This is especially important when formats evolve over time: a missing field might be an error or might simply be absent from an older schema version.
  • Trim before formatting. Remove any trailing commas or non-JSON wrapping (such as variable assignments like var data = {...};) that would cause the parser to fail. Clean input produces clean output.
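
The trimming tip can be partly automated. The sketch below strips a leading JavaScript variable assignment and a trailing semicolon before parsing; the helper name strip_js_wrapper and the regular expression are assumptions covering only the simple var/let/const case, and the sketch does not attempt to remove trailing commas inside the data.

```python
import json
import re

# Matches a leading "var name =", "let name =", or "const name =".
_ASSIGNMENT = re.compile(r"^\s*(?:var|let|const)\s+[A-Za-z_$][\w$]*\s*=\s*")


def strip_js_wrapper(raw: str) -> str:
    """Remove a leading JS assignment and trailing semicolon, if present."""
    text = _ASSIGNMENT.sub("", raw.strip(), count=1)
    return text.rstrip().rstrip(";").rstrip()


wrapped = 'var data = {"id": "req-1", "bidfloor": 0.5};'
cleaned = strip_js_wrapper(wrapped)
print(json.dumps(json.loads(cleaned), indent=2))
```

Input that has no wrapper passes through unchanged apart from the surrounding whitespace, so the helper is safe to apply to every pasted payload.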
