Document planning

Document planning involves looking at your target narrative and organizing it into sections, subsections, and paragraphs. This helps you identify which linguistic features, such as sentence aggregation and subject elision, can be used in your narratives.

Language design is often guided by corpora, that is, large sets of example narratives and example data sets.

Principles for creating good narratives

It is important to be aware of general principles for creating high-quality narratives. Briefly, these principles are the following:

  • Decide on the insights you want to derive from your data, then build your narrative around those.

  • At the document level, take the document structure that is most used in your corpus of narratives, then consider:

    • How might you vary the structure (using NLG Studio's random variation)?

    • Which edge cases will crop up and need to be dealt with in the narrative you build?

  • At the sentence level, take typical sentences from the corpus of narratives, then consider:

    • How might you add variation within sentences?

    • How will edge cases be dealt with in a particular sentence? You will likely have to switch out some sentences with all-new ones built specifically for edge cases. (In Studio, you do this using conditionals.)

    • How will you ensure correctness in your narratives? For example, based on the data you're feeding to the system, will list formation work as you intend? Will plurals be formed correctly? Will name references appear correctly in your narratives, sounding human rather than robotic?

  • Aim for quality in content, structure, and expression:

    • Good content. What does the user most want to know from the report? This is the key insight. What other insights are most important to report on?

    • Good structure. The key insight is often the best summary of the narrative. Put this first.

    • Good expression. Consider whether your narrative provides enough specificity. If the report is about a portfolio, what kind of portfolio? Some users may have more than one. How precise does the expression need to be? For example, if you are reporting on a decrease in value and the narrative renders this decrease as 0% when the drop was actually 0.3%, you need more precision. What about repetition? If the narrative repeats phrasing more often than a human reader needs, it will sound robotic.
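
Several of the checks above (edge cases, list formation, plurals, and precision) can be sketched in code. The following is a minimal illustration in Python, not Studio script code: the function names, the example sentence wording, and the portfolio data are all hypothetical, and in Studio you would use the product's own conditionals and language functions instead.

```python
def join_list(items):
    """Form an English list: 'A', 'A and B', or 'A, B, and C'."""
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def pluralize(count, singular, plural=None):
    """Return the count with a correctly inflected noun ('1 holding', '3 holdings')."""
    if plural is None:
        plural = singular + "s"  # naive default; pass `plural` for irregular nouns
    return f"{count} {singular if count == 1 else plural}"


def format_percent(value, max_decimals=4):
    """Use just enough decimals that a non-zero value is never shown as 0%."""
    for decimals in range(max_decimals + 1):
        text = f"{abs(value):.{decimals}f}"
        if float(text) != 0 or value == 0:
            return f"{text}%"
    # Still rounds to zero at max_decimals: say so instead of printing 0%.
    return f"less than {10 ** -max_decimals:.{max_decimals}f}%"


def describe_change(portfolio_name, change_pct, drivers):
    """Build one sentence, switching to a dedicated sentence for the edge case."""
    if change_pct == 0:
        # Edge case: an all-new sentence rather than "decreased by 0%".
        return f"The value of your {portfolio_name} portfolio was unchanged."
    direction = "decreased" if change_pct < 0 else "increased"
    sentence = (
        f"The value of your {portfolio_name} portfolio {direction} "
        f"by {format_percent(change_pct)}"
    )
    if drivers:
        sentence += (
            f", driven by {pluralize(len(drivers), 'holding')}: "
            f"{join_list(drivers)}"
        )
    return sentence + "."


print(describe_change("retirement", -0.3, ["Fund A", "Fund B", "Fund C"]))
print(describe_change("retirement", 0, []))
```

Naming the portfolio ("retirement") addresses specificity, the precision helper avoids reporting a 0.3% drop as 0%, and the conditional swaps in a different sentence when there is no change to report.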

You can find more guidance about this in the Quick Start in-app help on your Studio dashboard (in the Help tab, scroll to view the NLG Methodology Video Library).

[Image: Methodology video library]

Think long-term

Many Studio projects are not one-off exercises, but rather part of a collection of related projects that grow and evolve over time. In such contexts, it is useful to identify phrases, sentences, or paragraphs that occur in multiple projects (or in several places in one project) and create scripts or functions for these constituents. This may be more work the first time you do it, but it can save a lot of time in the long term. It can also make projects easier to maintain, especially if the code expected to change frequently is kept in a different script from the code that is less likely to change.

Tip

For easy re-use, identify phrases, sentences, or paragraphs that occur in multiple projects or places, and create scripts or functions for these constituents.

Document planning helps identify these key phrases, sentences, and scripts to implement in Studio.

Tip

Keep code that is expected to change frequently in a different script from the code that is less likely to change.

In the example below from the accelerator, the script called driversOffsetsPhrase1 contains code that is likely to change as the project is refined. Note also that this script is in a folder called narrative, which separates the scripts that generate text from the scripts that perform analytics (kept in a different folder called analytics).

[Image: driversOffsetsPhrase]
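
The same separation can be illustrated outside Studio. The sketch below is in Python rather than Studio script code, and every name and data value in it is hypothetical. It keeps the analytics (which derive facts from data and are less likely to change) apart from a reusable narrative phrase (whose wording is likely to change as the project is refined).

```python
# analytics: derives facts from the data and produces no text
def biggest_mover(changes):
    """Return (name, change) for the entry with the largest absolute change."""
    if not changes:
        return None
    name = max(changes, key=lambda k: abs(changes[k]))
    return name, changes[name]


# narrative: turns facts into wording; reusable wherever this phrase is needed
def mover_phrase(mover):
    """Describe the biggest mover, with dedicated wording for edge cases."""
    if mover is None:
        return "no change was recorded"
    name, change = mover
    if change == 0:
        return f"{name} was unchanged"
    verb = "rose" if change > 0 else "fell"
    return f"{name} {verb} by {abs(change):.1f}%"


def summary_sentence(changes):
    """Compose a sentence from the analytics result and the reusable phrase."""
    return f"Over the period, {mover_phrase(biggest_mover(changes))}."


print(summary_sentence({"Europe": 1.2, "Asia": -3.4}))
print(summary_sentence({}))
```

With this split, rewording the phrase touches only the narrative function, while a change to how the biggest mover is calculated touches only the analytics function.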