Scanning DSL Specification

Example with All Attributes
{
  "request_id": "123",
  "processing_strategy": "after-all-scans", // default value is "default" - process image after each scan immediately.

  // --------------- Scan Settings ---------------
  "twain_cap_setting": {
    "ICAP_PIXELTYPE": "TWPT_GRAY,TWPT_RGB", // Prefers GRAY, falls back to RGB (color); TWPT_BW is another option
    "ICAP_XSCALING/RESET": "null", // Resets a capability
    "ICAP_XRESOLUTION": "200", // Sets the resolution
    "ICAP_YRESOLUTION": "200", // Sets the resolution
    "CAP_AUTOFEED": false, // TW_BOOL, No default; TRUE to use ADF or FALSE to use Flatbed
    "ICAP_FRAMES":  "(0, 0, 4, 6)" // Scan part of the image only
  },

  "prompt_scan_more":  true, /** Default value: false */

  "i18n" : {
      "lang": "en" /** en (default) | de | es | fr | it | ja | pt | zh | user (user's OS locale) */
  },

  // --------------- Processing Settings ---------------
  // Blank page detection/discard
  "detect_blank_pages": "false", /** Default value: false */
  "blank_page_threshold": "0.02",
  "blank_page_margin_percent": "8",
  "blank_page_policy": "keep", /** "keep" (default) or "remove" */

  // Barcode reading
  "recognize_barcodes": "false", /** Default value: false */
  "barcodes_dpi": 100, /** DPI used to recognize barcodes, default value is 100; use high values for smaller barcodes. */
  "barcodes_settings": "", /** Additional barcode settings if any */

  // Document separation
  "doc_separators": [ /** applicable for PDF and TIFF formats only. */
      "TWEI_PATCHCODE:5:DOC_SEP", /**  */
      "blank:DOC_SEP",  /** Use blank sheets to separate documents. */
      "barcode:abc*:DOC_NEW" /** Pages with barcodes starting with abc mark beginning of documents (inclusive). */
  ],

  // --------------- Output Settings ---------------
  "output_settings": [
    {
      "type": "save", // return-base64, save, upload[-thumbnail]
      "format": "pdf", // bmp, png, jpg, tif, pdf // optional, default is jpg
      "thumbnail_height": 200, // only for -thumbnail; optional, default is 200
      "save_path": "${TMP}\\${TMS}${EXT}", // only for save

      /** JPG related */
      "jpeg_quality": "90", // optional, default is 80, only for JPG format

      /** TIFF Related */
      "tiff_compression": "G4", // optional, default is empty; only for TIFF format
      "tiff_force_single_page": "false",

      /** PDF Related */
      "pdf_force_black_white": "true", // optional, default is false; only for PDF format
      "pdfa_compliant": "false",
      "pdf_owner_password": "",
      "pdf_user_password": "",
      "pdf_text_line": "Asprise PDF/A scan by ${COMPUTERNAME}/${USERNAME} on ${DATETIME}",
      "exif": {
        "DocumentName": "PDF/A Scan",
        "UserComment": "Scanned using Asprise software"
      },

      /** Upload Related */
      "upload_after_all_done": "true", // default is true
      "upload_one_by_one": "false", // default is false
      "upload_target": {
        "url": "http://asprise.com/scan/applet/upload.php?action=dump",
        "method": "post",
        "max_retries": 2,
        "post_fields": { // Optional additional POST fields
          "provider": "Asprise"
        },
        "post_file_field_name": "asprise_scans", // Field name of uploaded files
        "post_files": [ // Optional additional files to be uploaded
          "C:\\_tmp0.jpg"
        ],
        "cookies": "name=Asprise; domain=asprise.com", // Optional cookies to pass
        "auth": "user:pass", // Optional auth info
        "headers": [ // Optional additional headers
          "Referer: http://asprise.com"
        ],
        "log_file": "null", // Log HTTP operations to a file for debug purpose
        "max_operation_time": 600, // Max operation timeout in seconds.
        "to_file": "null" // Save the HTTP response to a file.
      }
    }
  ],

  // --------------- Other Return Options: image info ---------------
  "retrieve_caps": [ // caps to be retrieved for each scan
    "ICAP_PIXELTYPE",
    "ICAP_XRESOLUTION",
    "ICAP_UNITS",
    "ICAP_FRAMES"
  ],

  "retrieve_extended_image_attrs": [ // device returned extended image attributes
    "BARCODE",
    "TWEI_PATCHCODE"
  ]
}

twain_cap_setting: Scan Settings

twain_cap_setting specifies scanning settings. You may use either the TWAIN constant name or the actual constant value in both attribute names and values. For example, "ICAP_PIXELTYPE": "TWPT_GRAY" is equivalent to "0x0101": "1".

Various capability setting and resetting operations are supported:

Set Single Value

Use a single value to set a capability directly. Example: "ICAP_PIXELTYPE": "TWPT_GRAY"

Set Value with Fallback

A comma-separated list of values instructs Asprise Scanning to try each value in order until one is set successfully. Example: "ICAP_PIXELTYPE": "TWPT_GRAY,TWPT_RGB"

Reset a Capability

Append /RESET to the capability name to instruct the device to reset that capability to its device default value. Example: "ICAP_XSCALING/RESET": null

You may specify capability setting attributes in any order; Asprise Scanning applies them in a suitable order.
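The three operations above can be combined in a single request. The following is a minimal sketch that builds such a twain_cap_setting object in Python and serializes it to JSON. The helper name build_cap_settings and its parameters are hypothetical and are not part of Asprise Scanning; only the attribute names and value syntax come from this section.

```python
import json


def build_cap_settings(set_values=None, fallbacks=None, resets=None):
    """Build a twain_cap_setting object (hypothetical helper, not an Asprise API).

    set_values: {capability: single value}
    fallbacks:  {capability: [values to try in order]}
    resets:     iterable of capability names to reset
    """
    caps = {}
    for name, value in (set_values or {}).items():
        caps[name] = str(value)
    for name, values in (fallbacks or {}).items():
        # A comma-separated list is tried value by value until one succeeds.
        caps[name] = ",".join(values)
    for name in resets or ():
        # Appending /RESET asks the device to restore its default value.
        caps[name + "/RESET"] = None
    return caps


request = {
    "twain_cap_setting": build_cap_settings(
        set_values={"ICAP_XRESOLUTION": 200, "ICAP_YRESOLUTION": 200},
        fallbacks={"ICAP_PIXELTYPE": ["TWPT_GRAY", "TWPT_RGB"]},
        resets=["ICAP_XSCALING"],
    )
}
request_json = json.dumps(request, indent=2)
```

Since the order of attributes does not matter, a plain dictionary is enough here; no ordering logic is needed on the caller's side.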

Commonly used capabilities (note that a particular scanner may not support all the capabilities listed below):

Capability | Description | Values
ICAP_PIXELTYPE | The type of pixel data that a Source is capable of acquiring (for example, black and white, gray, RGB, etc.). Commonly known as color mode. | TWPT_BW; TWPT_GRAY; TWPT_RGB
ICAP_BITDEPTH | Specifies the pixel bit depths for the current value of ICAP_PIXELTYPE. For example, 4-bit or 8-bit gray when ICAP_PIXELTYPE = TWPT_GRAY. | 1; 4; 8
ICAP_XRESOLUTION | The X-axis resolution measured in pixels per unit as defined by ICAP_UNITS. ICAP_XRESOLUTION and ICAP_YRESOLUTION should be the same. | e.g., 100, 200, 300
ICAP_YRESOLUTION | The Y-axis resolution measured in pixels per unit as defined by ICAP_UNITS. ICAP_XRESOLUTION and ICAP_YRESOLUTION should be the same. | e.g., 100, 200, 300
ICAP_UNITS | Determines the unit of measure for all quantities. | TWUN_INCHES (default); TWUN_CENTIMETERS; TWUN_PIXELS
ICAP_FRAMES | Specifies the part of the surface that the device should acquire. | (Left, Top, Right, Bottom)
CAP_FEEDERENABLED | Set to true to acquire images from the document feeder or false to acquire from the flatbed. | true; false
CAP_AUTOFEED | Specifies whether the Source should automatically feed the next page from the document feeder after each page is acquired when CAP_FEEDERENABLED is true. | true; false
ICAP_BRIGHTNESS | The brightness value the Source should use when scanning. | 0 (default) or int value
ICAP_CONTRAST | The contrast value the Source should use when scanning. | 0 (default) or int value
ICAP_ROTATION | How the Source can/should rotate the scanned image data prior to transfer. | 0 (default); -360 to +360
ICAP_ORIENTATION | Defines which edge of the “paper” the image’s “top” is aligned with. | TWOR_PORTRAIT (default); TWOR_ROT90; TWOR_ROT180; TWOR_LANDSCAPE
CAP_INDICATORS | Whether the Source should display a progress indicator during acquisition and transfer. | true (default); false

prompt_scan_more: Scan Multiple Pages in a Session

Set prompt_scan_more to true to prompt the user to scan multiple pages in a session.

i18n: Internationalization and Localization

Set lang to any of the following values:

  • en English

  • de Deutsch

  • es Español

  • fr Français

  • it Italiano

  • ja 日本語

  • pt Português

  • zh 中文

  • user user’s OS locale

If you need to display the scanner’s UI, you may also consider setting the TWAIN capability CAP_LANGUAGE to the desired language (TWLG_USERLOCALE, TWLG_DAN, TWLG_DUT, TWLG_ENG, TWLG_FCF, TWLG_FIN, TWLG_FRN, TWLG_GER, TWLG_ICE, TWLG_ITN, TWLG_NOR, TWLG_POR, TWLG_SPA, TWLG_SWE, or TWLG_USA).

retrieve_caps: Image Information to be Returned

Use retrieve_caps when you want to know the metadata of the images or the actual scan settings used to acquire them. retrieve_caps specifies an array of capabilities to be returned for each image acquired. You may use either the TWAIN constant name or the actual constant value as an array element. For example, ["ICAP_PIXELTYPE", "ICAP_XRESOLUTION"] is equivalent to ["0x0101", "0x1118"].

Refer to twain_cap_setting: Scan Settings for commonly used capabilities.

retrieve_extended_image_attrs: Extended Image Attributes to be Returned

Similar to retrieve_caps, retrieve_extended_image_attrs specifies an array of extended image attributes to be returned for each image acquired. You may use either the TWAIN constant name or the actual constant value as an array element. For example, ["TWEI_PATCHCODE"] is equivalent to ["0x1212"].

Note that typically only high-end scanners return extended image attributes.

For your convenience, "BARCODE" will be expanded to the following list of attributes: “TWEI_BARCODECOUNT”, “TWEI_BARCODECONFIDENCE”, “TWEI_BARCODEX”, “TWEI_BARCODEY”, “TWEI_BARCODETYPE”, “TWEI_BARCODEROTATION”, “TWEI_BARCODETEXTLENGTH”, “TWEI_BARCODETEXT”.
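The expansion described above can be illustrated with a short Python sketch. The function name expand_extended_attrs is hypothetical; the attribute names are taken from the list above. How Asprise Scanning treats duplicate entries is not documented here, so the de-duplication below is an assumption.

```python
# Attribute names copied from the documented expansion of "BARCODE".
BARCODE_ATTRS = [
    "TWEI_BARCODECOUNT",
    "TWEI_BARCODECONFIDENCE",
    "TWEI_BARCODEX",
    "TWEI_BARCODEY",
    "TWEI_BARCODETYPE",
    "TWEI_BARCODEROTATION",
    "TWEI_BARCODETEXTLENGTH",
    "TWEI_BARCODETEXT",
]


def expand_extended_attrs(attrs):
    """Expand the "BARCODE" shorthand; keep all other entries unchanged."""
    expanded = []
    for attr in attrs:
        names = BARCODE_ATTRS if attr == "BARCODE" else [attr]
        for name in names:
            if name not in expanded:  # assumption: duplicates are dropped
                expanded.append(name)
    return expanded


attrs = expand_extended_attrs(["BARCODE", "TWEI_PATCHCODE"])
```

With the input above, attrs holds the eight barcode attributes followed by TWEI_PATCHCODE.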

recognize_barcodes: Barcode Recognition

You may request high-end scanners to read barcodes using retrieve_extended_image_attrs. However, other scanners cannot decode barcodes themselves. By setting recognize_barcodes to true, you instruct Asprise Scanning to recognize a wide range of barcode and QR code formats even if the scanner device doesn’t support barcode recognition. The following barcode formats are supported:

  • CODE 128 (128b, 128C, 128raw)

  • EAN-8, EAN-13

  • UPC-A, UPC-E

  • code 3 of 9

  • code interleaved 2 of 5

  • QR code

  • DataMatrix

  • Aztec

  • PDF 417

detect_blank_pages: Detect/Discard Blank Pages Automatically

To detect blank pages, set detect_blank_pages to true. You may also tune the following parameters:

blank_page_threshold

Specifies the maximum ink coverage for a page to be considered blank. The default value is 0.02, meaning that a page with ink coverage below 2% is considered blank. Valid value range: 0 to 1.0

blank_page_margin_percent

Page margins are often prone to noise. This parameter allows you to exclude a certain percentage of the page along the margins. The default value is 8, meaning that 8% of the page width is treated as the left and right margins and 8% of the page height as the top and bottom margins. Valid value range: 0 to 100

blank_page_policy

Specifies what to do with blank pages: “keep” (default) or “remove”.
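To make the roles of blank_page_threshold and blank_page_margin_percent concrete, here is a rough Python sketch of an ink coverage check. It is not Asprise's actual algorithm, which is not described in this document. The function name, the grayscale input representation and the ink_level cutoff are all assumptions made for illustration.

```python
def is_blank_page(pixels, threshold=0.02, margin_percent=8, ink_level=128):
    """Return True if ink coverage inside the margins is below the threshold.

    pixels: rows of grayscale values (0 = black, 255 = white).
    ink_level: hypothetical cutoff; values below it count as ink.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    # Margins are a percentage of the page width and height respectively.
    dx = int(width * margin_percent / 100)
    dy = int(height * margin_percent / 100)
    total = 0
    ink = 0
    for row in pixels[dy:height - dy]:
        inner = row[dx:width - dx]
        total += len(inner)
        ink += sum(1 for value in inner if value < ink_level)
    if total == 0:
        return True  # nothing left to inspect after trimming the margins
    return ink / total < threshold
```

In this sketch, dark noise that falls entirely inside the excluded margin does not count toward coverage, which is the motivation the section gives for blank_page_margin_percent.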

doc_separators: Document Separation

Asprise scan makes batch separation easy and improves document organization for high-end scanners as well as economy models. Even if the scanner doesn’t support automatic document separation, you can use Asprise scan to separate documents automatically in a batch scan and save the results in multiple PDF or TIFF files.

You may use any combination of the following as a document separator:

Patch codes

Patch codes are a set of six distinct barcode patterns (Patch 1, 2, 3, 4, 6 and T) that are traditionally used as document separators when scanning. A common practice now is to use the Patch T code or the Patch 2 code as a page (document) separator. To use Patch T as the only separator, specify "TWEI_PATCHCODE:5:DOC_SEP". To use any patch code as a separator, specify "TWEI_PATCHCODE:?:DOC_SEP" (? matches a single character and * matches any number of characters; DOC_SEP means the separating page is not included in any document). To use this feature, you need to: 1) set ICAP_PATCHCODEDETECTIONENABLED to 1 in twain_cap_setting: Scan Settings; 2) retrieve "TWEI_PATCHCODE" in retrieve_extended_image_attrs as described in retrieve_extended_image_attrs: Extended Image Attributes to be Returned.

Any other TWAIN extended image attributes (TWEI)

Many high-end scanners can provide other extended image attributes. For example, TWEI_BARCODETEXT returns the barcode text natively recognized by the scanner when ICAP_BARCODEDETECTIONENABLED is set to 1. If the first page of each document carries a barcode with text doc_[x], you can use "TWEI_BARCODETEXT:doc_*:DOC_NEW" (? matches a single character and * matches any number of characters; DOC_NEW means the separating page is included as the first page of the new document).

Barcodes

To separate documents, you can also use Asprise scan’s built-in barcode recognition. Turn on barcode reading as described in recognize_barcodes: Barcode Recognition and specify the barcode content pattern for document separation, e.g. "barcode:abc*:DOC_NEW".

Blank pages

To use blank pages for document separation, you simply specify "blank:DOC_SEP".

Manual document separation using Asprise Dialog

On Asprise Dialog, the user can manually mark an image as a document separator (it only separates documents and is not part of any document), mark it as the beginning of a new document (it becomes the first page of that document), or unmark it.

[Figure: Manual document separation on Asprise Dialog]
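The difference between DOC_SEP and DOC_NEW is easiest to see in code. The Python sketch below splits a list of pages into documents using separator rules written in the syntax shown above. It only illustrates the semantics and is not how Asprise implements separation. The page representation (a dictionary keyed by "blank", "barcode" or a TWEI name) and all function names are hypothetical. Pattern matching uses Python's fnmatchcase, which supports ? and * as described but also accepts bracket expressions that this document does not mention.

```python
from fnmatch import fnmatchcase


def parse_separator(rule):
    """Split "source:pattern:action" (or "blank:action") into its parts."""
    parts = rule.split(":")
    if len(parts) < 2:
        raise ValueError("invalid separator rule: " + rule)
    if len(parts) == 2:  # e.g. "blank:DOC_SEP"
        return parts[0], None, parts[1]
    return parts[0], ":".join(parts[1:-1]), parts[-1]


def page_action(page, rules):
    """Return "DOC_SEP", "DOC_NEW" or None for one page."""
    for source, pattern, action in rules:
        value = page.get(source)
        if pattern is None:
            if value:
                return action
        elif value is not None and fnmatchcase(str(value), pattern):
            return action
    return None


def split_documents(pages, separators):
    """Group pages into documents according to the separator rules."""
    rules = [parse_separator(rule) for rule in separators]
    documents, current = [], []
    for page in pages:
        action = page_action(page, rules)
        if action in ("DOC_SEP", "DOC_NEW") and current:
            documents.append(current)  # close the document in progress
            current = []
        if action != "DOC_SEP":  # a DOC_SEP page belongs to no document
            current.append(page)
    if current:
        documents.append(current)
    return documents


pages = [
    {"id": 1, "barcode": "abc1"}, {"id": 2}, {"id": 3, "blank": True},
    {"id": 4}, {"id": 5, "barcode": "abc2"}, {"id": 6},
]
docs = split_documents(pages, ["blank:DOC_SEP", "barcode:abc*:DOC_NEW"])
doc_ids = [[page["id"] for page in doc] for doc in docs]
```

Here the blank page 3 is discarded as a pure separator, while pages 1 and 5 each open a new document and remain in it.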

Output Settings

For a scan session, you may specify any combination of the following output setting types to the output_settings attribute:

save

Save the scanned images to local hard disk drives or mapped network locations. Required attribute: save_path. More details are available in save, save-thumbnail.

upload

Upload the scanned images to a URL. Required attribute: upload_target.

return-base64

Returns the data of the scanned images in base64 format. Required attribute: none.

return-handle

For C/C++ only: returns the handles to the scanned images. Required attribute: none.

*-thumbnail

save-thumbnail, upload-thumbnail and return-base64-thumbnail perform the same operations on thumbnails (scaled-down versions of the images). thumbnail_height is optional and defaults to 200, i.e., the thumbnail is 200 pixels high.

You are free to combine any number of output settings. For example, the request below uploads the scanned images in PDF format to a remote web server and returns the base64 encoded data of the thumbnails (JPG format):

Upload PDF and Return the Thumbnails in JPG
{
   "output_settings" : [ {
     "type" : "upload",
     "format" : "pdf",
     "upload_target" : {
       "url" : "http://asprise.com/scan/applet/upload.php?action=dump"
     }
   }, {
     "type" : "return-base64-thumbnail",
     "format" : "jpg"
   } ]
}
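Because each output type has its own required attribute, it can be useful to check a request before submitting it. The Python sketch below does this for the types listed above. The function name missing_attributes is hypothetical, and the assumption that the -thumbnail variants require the same attributes as their base types is inferred from this section rather than stated outright.

```python
# Required attributes per output type, as listed in this section.
REQUIRED = {
    "save": ["save_path"],
    "save-thumbnail": ["save_path"],
    "upload": ["upload_target"],
    "upload-thumbnail": ["upload_target"],
    "return-base64": [],
    "return-base64-thumbnail": [],
    "return-handle": [],
}


def missing_attributes(setting):
    """Return the required attributes missing from one output_settings entry."""
    output_type = setting.get("type", "")
    if output_type not in REQUIRED:
        raise ValueError("unknown output type: " + repr(output_type))
    return [name for name in REQUIRED[output_type] if name not in setting]


output_settings = [
    {
        "type": "upload",
        "format": "pdf",
        "upload_target": {
            "url": "http://asprise.com/scan/applet/upload.php?action=dump"
        },
    },
    {"type": "return-base64-thumbnail", "format": "jpg"},
]
problems = [missing_attributes(setting) for setting in output_settings]
```

For the request shown above, every entry yields an empty list, meaning nothing is missing.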

Image Formats Supported

Use the format attribute to specify the desired output format. The following image formats are supported:

jpg

JPEG format. This is the default format, i.e., jpg will be used if format attribute is not present.

Use jpeg_quality to specify the JPEG quality. The default value is 80.

bmp

Bitmap format.

png

Portable Network Graphics.

tif

Tagged Image File Format.

Use tiff_compression to specify the compression to be used for TIFF. Valid values are: G4, G3, LZW, RLE, ZIP and NONE (default).

pdf

Portable Document Format. The following list of attributes may be used to customize PDF output:

pdf_force_black_white: Default is false, set to true to use CCITT Group 4 Compression for ultra small file size.

pdfa_compliant: Default is false, set to true to force PDF/A-1 compliance.

pdf_text_line: Optionally, you can specify a line of text to be printed on the bottom left of the first page. Macros can be used, e.g., “Scanned by ${COMPUTERNAME}/${USERNAME} on ${DATETIME}”.

pdf_owner_password, pdf_user_password: Optionally, you may specify passwords; these apply only when pdfa_compliant is false.

exif: the following tags can be specified: DocumentName, UserComment and Copyright.

save, save-thumbnail

You use save and save-thumbnail to save the scanned images or thumbnails to local hard disk drives or mapped network locations.

save_path is a required attribute that specifies the target output location. Note that save_path specifies the complete target file location, not the containing folder. If you set "save_path": ".\\test.jpg" and multiple images are scanned, you’ll get only the last one because earlier images are overwritten. To avoid this problem, you may use any of the macros listed below:

${TMS}

Timestamp with milliseconds, e.g., ‘2020-12-25_08-06-27.880’.

${TM}

Timestamp without milliseconds, e.g., ‘2020-12-25_08-06-27’.

${DATE}

Readable timestamp in a format similar to ISO 8601.

${I}

Image index in a scanning session.

${EXT}

Image file extension according to the image format specified using format (default is JPG).

${TMP}, ${USERPROFILE} and all other env variables

Environment variables are supported. You may define custom environment variables and use them in the path as long as you enclose the names in ${ and }.

Unsupported macros will be replaced with _.

The actual path to which the image is written will be returned in the result.
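To preview what a save_path template might expand to, the macro rules above can be approximated in Python. This is an illustration only and may differ from Asprise's own expansion in details. The function name expand_save_path is hypothetical, ${EXT} is assumed to include the leading dot (as the example "${TMP}\\${TMS}${EXT}" suggests), and ${DATE} is left out because its exact format is not specified here, so this sketch treats it like any other unknown name.

```python
import re
from datetime import datetime

_MACRO = re.compile(r"\$\{([^}]+)\}")


def expand_save_path(template, index, ext, now, env):
    """Approximate save_path macro expansion (illustration only).

    index: image index within the session; ext: extension such as ".jpg";
    now: datetime used for the timestamp macros; env: environment mapping.
    """
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    builtin = {
        "TMS": "%s.%03d" % (stamp, now.microsecond // 1000),
        "TM": stamp,
        "I": str(index),
        "EXT": ext,
    }

    def replace(match):
        name = match.group(1)
        if name in builtin:
            return builtin[name]
        # Environment variables are supported; anything unknown becomes "_".
        return env.get(name, "_")

    return _MACRO.sub(replace, template)


when = datetime(2020, 12, 25, 8, 6, 27, 880000)
path = expand_save_path("${TMP}\\${TMS}${EXT}", 0, ".jpg", when, {"TMP": "C:\\Temp"})
```

Passing os.environ as env would make the sketch pick up real environment variables; a plain dictionary is used here to keep the result reproducible.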

upload, upload-thumbnail

You use upload and upload-thumbnail to upload the scanned images or thumbnails to a remote server. The back end can be implemented in any programming language like C, C#/VB.NET, Java, Node.js, PHP, Python, Ruby on Rails, etc. as image data are sent using standard HTTP POST.

To specify the upload destination, you need to set upload_target to an object with the following attributes:

url

The only required attribute; specifies the target URL.

max_retries

Max number of retries before giving up; the default value is 2.

post_file_field_name

The name of the POST field for image files. Default value is ‘asprise_scans’ if there is only one file or ‘asprise_scans[]’ if there are many files. You don’t need to add ‘[]’ as the system will auto fix it.

post_fields

Optionally specifies additional fields to be POSTed.

cookies

Optional cookies to be sent along with the request.

auth

Optional authentication information, e.g. "user:pass".

headers

Optional array of additional HTTP request headers.

log_file

Optionally specify a local file to print logging information for debug purpose.

max_operation_time

Max HTTP operation time allowed in seconds. Default value is 1200 (20 minutes).
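The defaults listed above can be summarized in a short Python sketch that fills in missing upload_target attributes and derives the POST field name a back end should expect. Both function names are hypothetical. Stripping a trailing ‘[]’ when only one file is sent is an assumption about how the automatic fix behaves.

```python
# Documented defaults for optional upload_target attributes.
DEFAULTS = {
    "max_retries": 2,
    "post_file_field_name": "asprise_scans",
    "max_operation_time": 1200,
}


def with_defaults(upload_target):
    """Return a copy of upload_target with documented defaults filled in."""
    if "url" not in upload_target:
        raise ValueError("upload_target requires a url")
    merged = dict(DEFAULTS)
    merged.update(upload_target)
    return merged


def effective_field_name(upload_target, file_count):
    """Field name used for the uploaded files: plain for one, "[]" for many."""
    name = with_defaults(upload_target)["post_file_field_name"]
    if name.endswith("[]"):
        name = name[:-2]  # assumption: the suffix is managed automatically
    return name + "[]" if file_count > 1 else name


target = {"url": "http://asprise.com/scan/applet/upload.php?action=dump"}
single = effective_field_name(target, 1)
many = effective_field_name(target, 3)
```

A back end that accepts both single and batch uploads would therefore need to read the field under both names.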

return-base64, return-base64-thumbnail

Use return-base64 and return-base64-thumbnail to return the data of the scanned images and thumbnails in base64 format.

Once you have obtained the base64 encoded data from the result, you can decode it into binary and read the image from it.
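As a minimal sketch of the decoding step in Python, the function below turns a base64 string into bytes and guesses the format from the leading bytes. The function name decode_image is hypothetical, and the sample payload is synthetic; how the encoded string is extracted from the scan result is not covered here.

```python
import base64


def decode_image(encoded):
    """Decode base64 text into bytes and guess the format from magic bytes."""
    data = base64.b64decode(encoded)
    if data.startswith(b"\xff\xd8\xff"):
        kind = "jpg"
    elif data.startswith(b"\x89PNG\r\n\x1a\n"):
        kind = "png"
    elif data.startswith(b"%PDF"):
        kind = "pdf"
    else:
        kind = "unknown"
    return kind, data


# Synthetic payload that merely starts with the JPEG signature.
sample = base64.b64encode(b"\xff\xd8\xff\xe0rest-of-file").decode("ascii")
kind, data = decode_image(sample)
```

The decoded bytes can then be written to a file or handed to an image library of your choice.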