Python Receipt OCR Tutorial with Code Example

In this article, we'll walk through how to perform receipt OCR using Python in 10 minutes or less.

Introduction to Python Receipt OCR

Receipts contain useful transaction information, but most exist only on paper or in raw digital formats such as scanned PDFs or image files. Receipt digitization, that is, automated receipt processing via OCR, makes it simple to extract data such as merchant info, line items and amounts from scanned receipts using Python.

Python Receipt OCR in Practice

In this tutorial, we'll use a sample receipt image, US-1.jpg, as the input. Receipt OCR isn't limited to receipts in English: you can use receipts from any country in any language.

Open your favorite Python editor, copy the code snippet below and modify it to suit your needs. Alternatively, you can clone the complete example program from GitHub: github.com/Asprise/receipt-ocr

cURL:
curl -X POST -F "file=@US-1.jpg" https://ocr.asprise.com/api/v1/receipt

C# / VB.NET:
// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/csharp-vb-net-receipt-ocr
string response = httpPost("https://ocr.asprise.com/api/v1/receipt", // Receipt OCR API endpoint
new NameValueCollection()
    {
        {"api_key", "TEST"}, // Use 'TEST' for testing purpose
        {"recognizer", "auto"}, // can be 'US', 'CA', 'JP', 'SG' or 'auto'
        {"ref_no", "ocr_dot_net_123"} // optional caller provided ref code
    },
    new NameValueCollection() {{"file", "../../US-1.jpg"}} // Modify it to use your own file
);
Console.WriteLine(response); // Result in JSON
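
Java:
// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/java-receipt-ocr
import java.io.File;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
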
/**
 * Uploads an image for receipt OCR and gets the result in JSON.
 * Required dependencies: org.apache.httpcomponents:httpclient:4.5.13 and org.apache.httpcomponents:httpmime:4.5.13
 */
public class JavaReceiptOcr {

   public static void main(String[] args) throws Exception {
      String receiptOcrEndpoint = "https://ocr.asprise.com/api/v1/receipt"; // Receipt OCR API endpoint
      File imageFile = new File("US-1.jpg");

      System.out.println("=== Java Receipt OCR ===");

      try (CloseableHttpClient client = HttpClients.createDefault()) {
         HttpPost post = new HttpPost(receiptOcrEndpoint);
         post.setEntity(MultipartEntityBuilder.create()
            .addTextBody("api_key", "TEST")       // Use 'TEST' for testing purpose
            .addTextBody("recognizer", "auto")       // can be 'US', 'CA', 'JP', 'SG' or 'auto'
            .addTextBody("ref_no", "ocr_java_123")  // optional caller provided ref code
            .addPart("file", new FileBody(imageFile))    // the image file
            .build());

         try (CloseableHttpResponse response = client.execute(post)) {
            System.out.println(EntityUtils.toString(response.getEntity())); // Receipt OCR result in JSON
         }
      }
   }
}

JavaScript (Node.js):
// View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/javascript-nodejs-receipt-ocr
console.log("=== JavaScript/Node.js Receipt OCR ===");

var receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt';
var imageFile = 'US-1.jpg'; // Modify it to use your own file

var fs = require('fs');
var request = require('request');
request.post({
  url: receiptOcrEndpoint,
  formData: {
    api_key: 'TEST',        // Use 'TEST' for testing purpose
    recognizer: 'auto',        // can be 'US', 'CA', 'JP', 'SG' or 'auto'
    ref_no: 'ocr_nodejs_123', // optional caller provided ref code
    file: fs.createReadStream(imageFile) // the image file
  },
}, function(error, response, body) {
  if(error) {
    console.error(error);
    return; // no response body to print when the request itself failed
  }
  console.log(body); // Receipt OCR result in JSON
});

PHP:
<?php  // View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/php-receipt-ocr

function receiptOcr($imageFile) {
  $receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt'; // Receipt OCR API endpoint

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $receiptOcrEndpoint);
  curl_setopt($ch, CURLOPT_POST, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

  curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    'api_key' => 'TEST',        // Use 'TEST' for testing purpose
    'recognizer' => 'auto',     // can be 'US', 'CA', 'JP', 'SG' or 'auto'
    'ref_no' => 'ocr_php_123',  // optional caller provided ref code
    'file' => curl_file_create($imageFile) // the image file
  ));

  $result = curl_exec($ch);
  if(curl_errno($ch)){
      throw new Exception(curl_error($ch));
  }

  echo $result; // result in JSON
}

print("=== PHP Receipt OCR ===\n");
receiptOcr('US-1.jpg'); // Modify it to use your own file

Python:
# View complete code at: https://github.com/Asprise/receipt-ocr/tree/main/python-receipt-ocr
import requests  # third-party HTTP library: pip install requests

print("=== Python Receipt OCR ===")

receiptOcrEndpoint = 'https://ocr.asprise.com/api/v1/receipt'  # Receipt OCR API endpoint
imageFile = "US-1.jpg"  # Modify it to use your own file

# Open the image in a with-block so the file handle is closed after the upload
with open(imageFile, "rb") as f:
    r = requests.post(receiptOcrEndpoint, data={
        'api_key': 'TEST',           # Use 'TEST' for testing purpose
        'recognizer': 'auto',        # can be 'US', 'CA', 'JP', 'SG' or 'auto'
        'ref_no': 'ocr_python_123',  # optional caller provided ref code
    }, files={"file": f})

print(r.text)  # result in JSON
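
For repeated use, the Python call can be wrapped in a small function. The following is only a sketch under stated assumptions: it reuses the endpoint and form fields from the snippet above, while the names `build_form_fields`, `ocr_receipt` and `KNOWN_RECOGNIZERS`, the 60-second timeout, and the decision to reject recognizer values not listed in the snippet comments are illustrative choices, not part of the Asprise API.

```python
from pathlib import Path

RECEIPT_OCR_ENDPOINT = "https://ocr.asprise.com/api/v1/receipt"

# Values listed in the snippet comments above; the service may accept others.
KNOWN_RECOGNIZERS = {"auto", "US", "CA", "JP", "SG"}


def build_form_fields(api_key="TEST", recognizer="auto", ref_no=None):
    """Return the form fields for one OCR request (hypothetical helper)."""
    if recognizer not in KNOWN_RECOGNIZERS:
        raise ValueError(f"unknown recognizer: {recognizer!r}")
    fields = {"api_key": api_key, "recognizer": recognizer}
    if ref_no is not None:  # ref_no is optional, so omit it when not given
        fields["ref_no"] = ref_no
    return fields


def ocr_receipt(image_path, timeout=60, **field_options):
    """Upload one image and return the decoded JSON result as a dict."""
    # Imported here so the helper above can be used without the
    # third-party 'requests' package installed.
    import requests

    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(path)
    with path.open("rb") as image:
        response = requests.post(
            RECEIPT_OCR_ENDPOINT,
            data=build_form_fields(**field_options),
            files={"file": image},
            timeout=timeout,  # assumed value; tune for your network
        )
    response.raise_for_status()  # raise on HTTP errors instead of printing them
    return response.json()
```

A call such as `ocr_receipt("US-1.jpg", recognizer="US", ref_no="ocr_python_123")` would then return the parsed result, or raise an exception if the file is missing or the request fails.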

The complete source code of the receipt OCR examples in C#, Java, JavaScript, PHP and Python can be found at github.com/Asprise/receipt-ocr

If you are not ready to write code yet, you can try the free web-based receipt OCR capture/recognition tool.

Python Receipt OCR Result

Execute the code and you'll get the result in JSON. The result contains both structured data, such as merchant name, address, phone, VAT/GST tax registration number, receipt number, country, currency, subtotal, total amounts and line items, and the full-text OCR output.
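
To work with the structured data rather than the raw text, the JSON can be decoded and reduced to the fields you need. The sketch below is illustrative only: the key names (`receipts`, `merchant_name`, `currency`, `subtotal`, `total`, `items`, `description`, `amount`) are assumptions about the response layout and should be checked against an actual result, the function name `summarize_receipt` is hypothetical, and the sample payload is hand-written for demonstration, not real API output.

```python
import json


def summarize_receipt(result_json):
    """Pick a few structured fields out of an OCR result (JSON text).

    All key names used here are assumed; verify them against a real response.
    Returns None when the result contains no receipt.
    """
    result = json.loads(result_json)
    receipts = result.get("receipts") or []
    if not receipts:
        return None
    receipt = receipts[0]  # use the first receipt found in the image
    items = [
        (item.get("description"), item.get("amount"))
        for item in receipt.get("items") or []
    ]
    return {
        "merchant": receipt.get("merchant_name"),
        "currency": receipt.get("currency"),
        "subtotal": receipt.get("subtotal"),
        "total": receipt.get("total"),
        "items": items,
    }


# Hand-written sample payload for illustration only; not real API output.
sample = json.dumps({
    "receipts": [{
        "merchant_name": "Example Store",
        "currency": "USD",
        "subtotal": 5.5,
        "total": 5.94,
        "items": [
            {"description": "Coffee", "amount": 3.5},
            {"description": "Bagel", "amount": 2.0},
        ],
    }]
})

summary = summarize_receipt(sample)
print(summary["merchant"], summary["currency"], summary["total"])
for description, amount in summary["items"]:
    print(f"  {description}: {amount}")
```

Using `.get()` throughout means a missing field yields `None` instead of raising an exception, which is convenient when some receipts lack, for example, a subtotal.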

Beyond Python Receipt OCR

Besides Python, receipt OCR tutorials are available for other programming languages: C#/VB.NET, Java, JavaScript and PHP.
