Use Case 2 - Ditto Marks on Handwritten Medical Documents

Jul 20, 2023 11:53:18 AM

Data extraction solutions have advanced significantly in the last 2 to 3 years, allowing us to do things today that were impossible just a few months ago. One of the most notable improvements is the ability of these recent solutions to understand the context of a document.

If you missed the introduction to this mini-series, you can find it here: Mini-series: the real power of new data extraction solutions. In the previous article, we discussed how to handle date formats, which you can access here: Smart Date Formatting.

This second article will explore another real-world use case: ditto marks.

What Are Ditto Marks?

You may be wondering what a ditto mark is, but chances are you already know it, just not by that name. A ditto mark is a double-quote-like character (") indicating that the word or value above it should be repeated. It is commonly used in handwritten documents, where the precious “copy-paste” does not exist.

Let's look at an example. When consecutive items listed on a document share the same value, a ditto mark is written instead of repeating that value, which saves the writer time.

Name               Role        Department
Olivia Anderson    Director    Accounting
Ethan Roberts         "        Human Resources
Ava Johnson        Manager          "
Liam Adams            "        Engineering
Sophia Campbell       "        Sales
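To make the rule concrete, here is a minimal Python sketch of how the table above would be filled in once it has already been read into rows and columns. The function name `resolve_dittos` and the set of characters treated as ditto marks are assumptions made for this illustration; they are not part of any particular product.

```python
# Characters treated as ditto marks in this sketch (an assumption; extend as needed).
DITTO_MARKS = {'"', '”', '“', '〃'}


def resolve_dittos(rows):
    """Replace each ditto mark with the resolved value directly above it."""
    resolved = []
    for row in rows:
        new_row = []
        for col, cell in enumerate(row):
            if cell.strip() in DITTO_MARKS:
                if not resolved:
                    raise ValueError("a ditto mark in the first row has nothing to repeat")
                # The row above is already resolved, so chains of ditto marks work too.
                new_row.append(resolved[-1][col])
            else:
                new_row.append(cell)
        resolved.append(new_row)
    return resolved


staff = [
    ["Olivia Anderson", "Director", "Accounting"],
    ["Ethan Roberts", '"', "Human Resources"],
    ["Ava Johnson", "Manager", '"'],
    ["Liam Adams", '"', "Engineering"],
    ["Sophia Campbell", '"', "Sales"],
]

for row in resolve_dittos(staff):
    print(row)
```

With this rule, Ethan Roberts is a Director, Ava Johnson works in Human Resources, and Liam Adams and Sophia Campbell are both Managers.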


Real-world Use Case: Medical Practitioners Documents

General Practitioners (GPs), for example, are heavy users of ditto marks.

In Belgium, GPs and other first-line practitioners must submit a specific document to the medical insurer each time they see a patient. These documents list all the care provided to the patient along with the associated dates. While the electronic generation of these documents is becoming more common, many GPs still prefer to write them by hand. Once you know how many of these documents a GP writes every day, you understand why they love ditto marks.

Here's an example of what the part of the document listing the provided care might look like:


01-05-2023    Regular Consultation
01-06-2023             "
     "        Specific Care


Note: in actual cases, the provided care is not described in full English; official codes are used to save time and minimise errors. For the sake of simplicity, fake full English names are used.

The Complexity of Handling Ditto Marks for Automatic Data Extraction

Until recently, automatically extracting such data was almost impossible because ditto marks are so hard to handle. For a human, it's easy: you just grab the value above the ditto mark. For a computer program, it is much more complex. Traditional OCR engines have little or no sense of what is above or below, since they simply output a continuous sequence of characters. Moreover, a document can be scanned horizontally, vertically, or at any angle in between, so extra preparation or post-processing is needed to work out which value actually sits above a given mark. Consequently, extracted data often required manual verification by the insurance company.
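The sketch below illustrates what "above" means once position information is available. It assumes a hypothetical OCR output in which every token carries the centre coordinates of its bounding box, and it assumes the page has already been rotated upright; neither is given by a plain character stream. The `Token` class, the `value_above` function, and the tolerance value are illustrative names and numbers, not a real library's API.

```python
from dataclasses import dataclass

DITTO_MARKS = {'"', '”', '〃'}


@dataclass
class Token:
    text: str
    x: float  # horizontal centre of the token's bounding box
    y: float  # vertical centre; larger values are lower on the page


def value_above(ditto, tokens, x_tolerance=40.0):
    """Return the text of the nearest non-ditto token above `ditto` in the same column."""
    candidates = [
        t for t in tokens
        if t.y < ditto.y                           # strictly higher on the page
        and abs(t.x - ditto.x) <= x_tolerance      # roughly the same column
        and t.text not in DITTO_MARKS              # skip over chained ditto marks
    ]
    if not candidates:
        return None
    # The closest candidate is the one with the largest y among those above.
    return max(candidates, key=lambda t: t.y).text


tokens = [
    Token("01-05-2023", 100, 50), Token("Regular Consultation", 400, 50),
    Token("01-06-2023", 100, 100), Token('"', 400, 100),
    Token('"', 100, 150), Token("Specific Care", 400, 150),
]

print(value_above(tokens[3], tokens))  # Regular Consultation
print(value_above(tokens[4], tokens))  # 01-06-2023
```

Even this toy version depends on a deskewed page and a sensible column tolerance, which is exactly the kind of preparation work that handwriting and imperfect scans make fragile.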

Training Models to Understand Ditto Marks

Fortunately, recent data extraction models can be trained to understand how ditto marks work and where to find the information they refer to.

For this example, our solution will automatically extract three care entries:


[{'2023-05-01': 'Regular Consultation'},
 {'2023-06-01': 'Regular Consultation'},
 {'2023-06-01': 'Specific Care'}]
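For readers who want to check the expected result by hand, the following Python sketch reproduces it from the three handwritten rows. It is not how the extraction model works internally; it only spells out the logic the model is expected to reproduce. It assumes the rows are already split into a date and a care value, and that the handwritten dates are in day-month-year order, as the output above implies. The function name `extract_care` is hypothetical.

```python
from datetime import datetime

DITTO = '"'


def extract_care(rows):
    """Turn (date, care) rows containing ditto marks into ISO-dated entries."""
    entries = []
    last_date = last_care = None
    for raw_date, raw_care in rows:
        if raw_date == DITTO:
            date = last_date  # repeat the date from the row above
        else:
            # Handwritten dates are assumed to be DD-MM-YYYY.
            date = datetime.strptime(raw_date, "%d-%m-%Y").date().isoformat()
        care = last_care if raw_care == DITTO else raw_care
        if date is None or care is None:
            raise ValueError("a ditto mark in the first row has nothing to repeat")
        entries.append({date: care})
        last_date, last_care = date, care
    return entries


rows = [
    ("01-05-2023", "Regular Consultation"),
    ("01-06-2023", DITTO),
    (DITTO, "Specific Care"),
]

print(extract_care(rows))
```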


By correctly interpreting ditto marks, insurance companies processing large volumes of these documents can significantly reduce errors and avoid the need for manual verification. This, in turn, leads to quicker reimbursement for patients.

If you are also processing handwritten documents, you could use these new models to automate your data extraction processes. Our team will be glad to help you in your digitalisation efforts. Don't hesitate to contact us.

Following Article: Handling Repetitions in Handwritten Documents

These new models can do even more. In our next article, we will explore how they can handle repetitions in handwritten documents: Use Case 3 - Repetitions Handling.
