Asprise Scanning Request DSL¶
Over the years, the developer community has given us a great deal of feedback, and we are working hard to simplify your job. In addition to the traditional API, this release allows you to use a JSON-based domain-specific language (DSL) to specify complex scan settings and powerful output options.
Introduction to Scanning Domain-Specific Language¶
To implement a typical scan function, you usually call APIs to set scan parameters such as color mode and paper size, then call APIs to acquire the images, and then make another set of API calls for post-processing such as converting to PDF and uploading. Asprise Scanning DSL eliminates most of these API calls and allows you to simply specify your intent in a readable and maintainable format.
Hello, World: Scan DSL¶
{
  "twain_cap_setting" : {
    "ICAP_PIXELTYPE" : "TWPT_RGB", // Color
    "ICAP_XRESOLUTION" : "100", // DPI: 100
    "ICAP_YRESOLUTION" : "100",
    "ICAP_SUPPORTEDSIZES" : "TWSS_USLETTER" // Paper size: TWSS_USLETTER, TWSS_A4, ...
  },
  "output_settings" : [ {
    "type" : "save",
    "format" : "pdf",
    "save_path" : "${TMP}\\${TMS}${EXT}" // Can be absolute path or path containing variables
  } ]
}
You can then pass the scan request in JSON to one of the methods accepting it. For example:
String scanRequestInJson = "{ ... }"; // scan request in DSL
Imaging imaging = new com.asprise.imaging.core.Imaging(null);
Result r = imaging.scan(scanRequestInJson, "select", true, true);
// Alternatively, use the user-friendly Asprise Scan UI Dialog
r = new AspriseScanUI().setRequest(Request.fromJson(scanRequestInJson))
.showDialog(null, "Dialog Title", false, null);
Make Use of DSL’s Flexibility¶
The scan request is a plain string, so you can externalize it: for example, load it from an external file, URL or database.
You may also consider setting up a number of scanning profiles so that users can perform personalized tasks.
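As a minimal sketch of externalizing the request, the snippet below loads a named scanning profile from a folder of JSON files. The ScanProfileLoader class, the loadProfile helper and the one-file-per-profile layout are assumptions made for illustration; they are not part of the Asprise API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: keeps one scan request (DSL) per file, e.g. profiles/invoice.json.
class ScanProfileLoader {

    /** Reads the scan request stored for the given profile name. */
    static String loadProfile(Path profileDir, String profileName) throws IOException {
        Path file = profileDir.resolve(profileName + ".json");
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway profile folder so that the example is self-contained.
        Path dir = Files.createTempDirectory("scan-profiles");
        String sample = "{ \"output_settings\": [ { \"type\": \"return-base64\" } ] }";
        Files.write(dir.resolve("quick.json"), sample.getBytes(StandardCharsets.UTF_8));

        String scanRequestInJson = loadProfile(dir, "quick");
        System.out.println(scanRequestInJson);
    }
}
```

The loaded string can then be handed to any method that accepts a scan request in JSON, such as the scan call shown in the Hello, World example.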
Scanning DSL Specification¶
{
  "request_id": "123",
  "processing_strategy": "after-all-scans", // default value is "default" - process image after each scan immediately.
  // --------------- Scan Settings ---------------
  "twain_cap_setting": {
    "ICAP_PIXELTYPE": "TWPT_GRAY,TWPT_RGB", // Prefers gray, falls back to color; TWPT_BW for black and white
    "ICAP_XSCALING/RESET": "null", // Resets a capability
    "ICAP_XRESOLUTION": "200", // Sets the resolution
    "ICAP_YRESOLUTION": "200", // Sets the resolution
    "CAP_AUTOFEED": false, // TW_BOOL, No default; TRUE to use ADF or FALSE to use Flatbed
    "ICAP_FRAMES": "(0, 0, 4, 6)" // Scan part of the image only
  },
  "prompt_scan_more": true, /** Default value: false */
  "i18n" : {
    "lang": "en" /** en (default) | de | es | fr | it | ja | pt | zh | user (user's OS locale) */
  },
  // --------------- Processing Settings ---------------
  // Blank page detection/discard
  "detect_blank_pages": "false", /** Default value: false */
  "blank_page_threshold": "0.02",
  "blank_page_margin_percent": "8",
  "blank_page_policy": "keep", /** "keep" (default) or "remove" */
  // Barcode reading
  "recognize_barcodes": "false", /** Default value: false */
  "barcodes_dpi": 100, /** DPI used to recognize barcodes, default value is 100; use higher values for smaller barcodes. */
  "barcodes_settings": "", /** Additional barcode settings if any */
  // Document separation
  "doc_separators": [ /** applicable for PDF and TIFF formats only. */
    "TWEI_PATCHCODE:5:DOC_SEP", /** Use patch T sheets to separate documents. */
    "blank:DOC_SEP", /** Use blank sheets to separate documents. */
    "barcode:abc*:DOC_NEW" /** Pages with barcodes starting with abc mark beginning of documents (inclusive). */
  ],
  // --------------- Output Settings ---------------
  "output_settings": [
    {
      "type": "save", // return-base64, save, upload[-thumbnail]
      "format": "pdf", // bmp, png, jpg, tif, pdf // optional, default is jpg
      "thumbnail_height": 200, // only for -thumbnail; optional, default is 200
      "save_path": "${TMP}\\${TMS}${EXT}", // only for save
      /** JPG related */
      "jpeg_quality": "90", // optional, default is 80, only for JPG format
      /** TIFF related */
      "tiff_compression": "G4", // optional, default is empty; only for TIFF format
      "tiff_force_single_page": "false",
      /** PDF related */
      "pdf_force_black_white": "true", // optional, default is false; only for PDF format
      "pdfa_compliant": "false",
      "pdf_owner_password": "",
      "pdf_user_password": "",
      "pdf_text_line": "Asprise PDF/A scan by ${COMPUTERNAME}/${USERNAME} on ${DATETIME}",
      "exif": {
        "DocumentName": "PDF/A Scan",
        "UserComment": "Scanned using Asprise software"
      },
      /** Upload related */
      "upload_after_all_done": "true", // default is true
      "upload_one_by_one": "false", // default is false
      "upload_target": {
        "url": "http://asprise.com/scan/applet/upload.php?action=dump",
        "method": "post",
        "max_retries": 2,
        "post_fields": { // Optional additional POST fields
          "provider": "Asprise"
        },
        "post_file_field_name": "asprise_scans", // Field name of uploaded files
        "post_files": [ // Optional additional files to be uploaded
          "C:\\_tmp0.jpg"
        ],
        "cookies": "name=Asprise; domain=asprise.com", // Optional cookies to pass
        "auth": "user:pass", // Optional auth info
        "headers": [ // Optional additional headers
          "Referer: http://asprise.com"
        ],
        "log_file": "null", // Log HTTP operations to a file for debug purpose
        "max_operation_time": 600, // Max operation timeout in seconds.
        "to_file": "null" // Save the HTTP response to a file.
      }
    }
  ],
  // --------------- Other Return Options: image info ---------------
  "retrieve_caps": [ // caps to be retrieved for each scan
    "ICAP_PIXELTYPE",
    "ICAP_XRESOLUTION",
    "ICAP_UNITS",
    "ICAP_FRAMES"
  ],
  "retrieve_extended_image_attrs": [ // device returned extended image attributes
    "BARCODE",
    "TWEI_PATCHCODE"
  ]
}
twain_cap_setting: Scan Settings¶
twain_cap_setting specifies scanning settings. You may use either the TWAIN constant name or the actual constant value in both attribute names and values.
For example, "ICAP_PIXELTYPE": "TWPT_GRAY" is equivalent to "0x0101": "1".
Various capability setting and resetting operations are supported:
- Set Single Value
Use a single value to set a capability directly. Example:
"ICAP_PIXELTYPE": "TWPT_GRAY"
- Set Value with Fallback
A comma-separated list of values instructs Asprise Scanning to try each value in order until one is set successfully. Example:
"ICAP_PIXELTYPE": "TWPT_GRAY,TWPT_RGB"
- Reset a Capability
Append /RESET to the capability name to instruct the device to reset the capability to its device default value. Example:
"ICAP_XSCALING/RESET": null
You may specify capability settings in any order; Asprise Scanning works out the proper order in which to apply them.
Commonly used capabilities (note that a particular scanner may not support all the capabilities listed below):
Capability | Description | Values |
---|---|---|
ICAP_PIXELTYPE | The type of pixel data that a Source is capable of acquiring (for example, black and white, gray, RGB, etc.). Commonly known as color mode. | TWPT_BW; TWPT_GRAY; TWPT_RGB |
ICAP_BITDEPTH | Specifies the pixel bit depths for the Current value of ICAP_PIXELTYPE. For example, 4-bit or 8-bit gray when ICAP_PIXELTYPE = TWPT_GRAY. | 1; 4; 8 |
ICAP_XRESOLUTION | The X-axis resolution measured in units of pixels per unit as defined by ICAP_UNITS. ICAP_XRESOLUTION and ICAP_YRESOLUTION should be the same. | e.g., 100, 200, 300 |
ICAP_YRESOLUTION | The Y-axis resolution measured in units of pixels per unit as defined by ICAP_UNITS. ICAP_XRESOLUTION and ICAP_YRESOLUTION should be the same. | e.g., 100, 200, 300 |
ICAP_UNITS | ICAP_UNITS determines the unit of measure for all quantities. | TWUN_INCHES (default); TWUN_CENTIMETERS; TWUN_PIXELS |
ICAP_FRAMES | Specifies the part of the surface that the device should acquire. | (Left, Top, Right, Bottom) |
CAP_FEEDERENABLED | Set to true to acquire images from the document feeder or false to acquire from the flatbed. | true; false |
CAP_AUTOFEED | Specifies whether the Source should automatically feed the next page from the document feeder after each page is acquired when CAP_FEEDERENABLED is true. | true; false |
ICAP_BRIGHTNESS | The brightness value Source should use when scanning. | 0 (default) or int value. |
ICAP_CONTRAST | The contrast value Source should use when scanning. | 0 (default) or int value. |
ICAP_ROTATION | How the Source can/should rotate the scanned image data prior to transfer. | 0 (default); +/-360 |
ICAP_ORIENTATION | Defines which edge of the “paper” the image’s “top” is aligned with. | TWOR_PORTRAIT (default); TWOR_ROT90; TWOR_ROT180; TWOR_LANDSCAPE |
CAP_INDICATORS | Whether the Source should display a progress indicator during acquisition and transfer. | true (default); false |
prompt_scan_more: Scan Multiple Pages in a Session¶
Set prompt_scan_more to true to prompt the user to scan multiple pages in a session.
i18n: Internationalization and Localization¶
Set lang to any of the following values:
en English
de Deutsch
es Español
fr Français
it Italiano
ja 日本語
pt Português
zh 中文
user user’s OS locale
If you need to display the scanner’s UI, you may also consider setting the TWAIN capability CAP_LANGUAGE to the desired language (TWLG_USERLOCALE, TWLG_DAN, TWLG_DUT, TWLG_ENG, TWLG_FCF, TWLG_FIN, TWLG_FRN, TWLG_GER, TWLG_ICE, TWLG_ITN, TWLG_NOR, TWLG_POR, TWLG_SPA, TWLG_SWE, or TWLG_USA).
retrieve_caps: Image Information to be Returned¶
Use it when you want to know the metadata of the images or the actual scan settings used when the images were scanned.
retrieve_caps specifies an array of capabilities that should be returned for each image acquired. You may use either the TWAIN constant name or the actual constant value as the array element. For example, ["ICAP_PIXELTYPE", "ICAP_XRESOLUTION"] is equivalent to ["0x0101", "0x1118"].
You may refer to twain_cap_setting: Scan Settings for commonly used capabilities.
retrieve_extended_image_attrs: Extended Image Attributes to be Returned¶
Similar to retrieve_caps, retrieve_extended_image_attrs specifies an array of extended image attributes to be returned for each image acquired. You may use either the TWAIN constant name or the actual constant value as the array element. For example, ["TWEI_PATCHCODE"] is equivalent to ["0x1212"].
Note that only high-end scanners may return extended image attributes.
For your convenience, "BARCODE" will be expanded to the following list of attributes: "TWEI_BARCODECOUNT", "TWEI_BARCODECONFIDENCE", "TWEI_BARCODEX", "TWEI_BARCODEY", "TWEI_BARCODETYPE", "TWEI_BARCODEROTATION", "TWEI_BARCODETEXTLENGTH", "TWEI_BARCODETEXT".
recognize_barcodes: Barcode Recognition¶
You may request high-end scanners to read barcodes using retrieve_extended_image_attrs. However, other scanners are unable to decode barcodes.
By setting recognize_barcodes to true, you instruct Asprise Scanning to recognize a wide range of barcode and QR code formats even if the scanner device doesn’t support barcode recognition. The following barcode formats are supported:
CODE 128 (128b, 128C, 128raw)
EAN-8, EAN-13
UPC-A, UPC-E
code 3 of 9
code interleaved 2 of 5
QR code
DataMatrix
Aztec
PDF 417
detect_blank_pages: Detect/Discard Blank Pages Automatically¶
To detect blank pages, set detect_blank_pages to true. You may also tune the following parameters:
- blank_page_threshold
Specifies the maximum ink coverage for a page to be considered blank. The default value is 0.02, meaning that a page whose ink coverage is less than 2% is considered blank. Valid value range: 0 to 1.0.
- blank_page_margin_percent
Page margins are often prone to noise. This parameter allows you to exclude a certain percentage of the page at the margins. The default value is 8, meaning that 8% of the page width is treated as the left and right margins and 8% of the page height as the top and bottom margins. Valid value range: 0 to 100.
- blank_page_policy
What to do with blank pages: “keep” (default) or “remove”.
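To make the two tuning parameters concrete, here is a rough sketch of how ink coverage could be computed. It is not Asprise's actual implementation: the class and method names are hypothetical, the margin percentage is assumed to be trimmed from each side of the page, and the gray-level cut-off of 128 for counting a pixel as ink is an arbitrary choice.

```java
import java.util.Arrays;

// Illustrative only: mimics the meaning of blank_page_threshold and blank_page_margin_percent.
class BlankPageSketch {

    /**
     * Returns the fraction of dark pixels inside the page once the margins are excluded.
     * gray[y][x] holds gray levels from 0 (black) to 255 (white).
     */
    static double inkCoverage(int[][] gray, double marginPercent, int darkBelow) {
        int height = gray.length;
        int width = gray[0].length;
        int marginX = (int) Math.round(width * marginPercent / 100.0);  // assumed: per side
        int marginY = (int) Math.round(height * marginPercent / 100.0); // assumed: per side
        long dark = 0;
        long total = 0;
        for (int y = marginY; y < height - marginY; y++) {
            for (int x = marginX; x < width - marginX; x++) {
                total++;
                if (gray[y][x] < darkBelow) {
                    dark++;
                }
            }
        }
        return total == 0 ? 0.0 : (double) dark / total;
    }

    /** A page counts as blank when its ink coverage is below the threshold. */
    static boolean isBlank(int[][] gray, double threshold, double marginPercent) {
        return inkCoverage(gray, marginPercent, 128) < threshold;
    }

    public static void main(String[] args) {
        int[][] page = new int[100][100];
        for (int[] row : page) {
            Arrays.fill(row, 255); // start with a white page
        }
        page[2][2] = 0; // a speck of noise inside the margin is ignored
        System.out.println("noise only, blank? " + isBlank(page, 0.02, 8));

        for (int y = 40; y < 60; y++) {
            for (int x = 40; x < 60; x++) {
                page[y][x] = 0; // a 20x20 block of ink in the middle of the page
            }
        }
        System.out.println("with ink, blank? " + isBlank(page, 0.02, 8));
    }
}
```

Raising blank_page_threshold makes the check more tolerant of stray marks, while raising blank_page_margin_percent ignores a wider border where punch holes and scanner shadows tend to appear.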
doc_separators: Document Separation¶
Asprise scan makes batch separation easy and improves document organization for high-end scanners as well as economical models. Even if the scanner doesn’t support automatic document separation, you can use Asprise scan to separate documents automatically in a batch scan and save the scan results in multiple PDF or TIFF files.
You may use any combination of the following as a document separator:
- Patch codes
Patch codes are a set of six distinct barcode patterns (Patch 1, 2, 3, 4, 6 and T) that are traditionally used as document separators when scanning. A common use now is to use the Patch T code or the Patch 2 code as a page (document) separator. To use Patch T as the only separator, specify "TWEI_PATCHCODE:5:DOC_SEP". To use any of the patch codes as a separator, specify "TWEI_PATCHCODE:?:DOC_SEP" (? matches a single character and * matches any number of characters; DOC_SEP means the separating page should not be included in the document). Note that to use this feature, you need to: 1) set ICAP_PATCHCODEDETECTIONENABLED to 1 in twain_cap_setting: Scan Settings; 2) retrieve "TWEI_PATCHCODE" in retrieve_extended_image_attrs as described in retrieve_extended_image_attrs: Extended Image Attributes to be Returned.
- Any other TWAIN extended image attributes (TWEI)
Many high-end scanners are capable of providing other extended image attributes. For example, TWEI_BARCODETEXT returns the barcode text natively recognized by the scanner when ICAP_BARCODEDETECTIONENABLED is set to 1. If there is a barcode with text doc_[x] on the first page of each document, you can use "TWEI_BARCODETEXT:doc_*:DOC_NEW" (? matches a single character and * matches any number of characters; DOC_NEW means the separating page is included as the first page of the new document).
- Barcodes
To separate documents, you can also use Asprise scan’s built-in barcode recognition. Turn on barcode reading as instructed in recognize_barcodes: Barcode Recognition and specify the barcode content match for document separation, e.g. "barcode:abc*:DOC_NEW".
- Blank pages
To use blank pages for document separation, simply specify "blank:DOC_SEP".
- Manual document separation using Asprise Dialog
On Asprise Dialog, the user can manually mark an image as a document separator (it only separates documents and is not part of any document), mark it as the beginning of a new document (it is the first page of that document), or unmark it.
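The separator strings above follow a source:pattern:action shape. The sketch below shows one way such a rule could be parsed and its wildcard pattern evaluated, which can be handy for checking a pattern before a batch scan. The DocSeparatorRule class is hypothetical and not part of the Asprise API; it assumes that * also matches an empty string and that a two-part rule such as blank:DOC_SEP has no pattern to test.

```java
import java.util.regex.Pattern;

// Illustrative parser for rules such as "barcode:abc*:DOC_NEW" or "blank:DOC_SEP".
class DocSeparatorRule {
    final String source;  // e.g. TWEI_PATCHCODE, barcode, blank
    final String pattern; // wildcard pattern; "*" when the rule has none
    final String action;  // DOC_SEP or DOC_NEW

    DocSeparatorRule(String rule) {
        int first = rule.indexOf(':');
        int last = rule.lastIndexOf(':');
        if (first < 0) {
            throw new IllegalArgumentException("Expected source[:pattern]:action but got: " + rule);
        }
        source = rule.substring(0, first);
        pattern = (first == last) ? "*" : rule.substring(first + 1, last);
        action = rule.substring(last + 1);
    }

    /** Returns true if the value matches the pattern (? = one character, * = any run of characters). */
    boolean matches(String value) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    public static void main(String[] args) {
        DocSeparatorRule rule = new DocSeparatorRule("barcode:abc*:DOC_NEW");
        System.out.println(rule.source + " / " + rule.pattern + " / " + rule.action);
        System.out.println(rule.matches("abc-0001"));
        System.out.println(rule.matches("xyz-0001"));
    }
}
```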
Output Settings¶
For a scan session, you may specify any combination of the following output setting types in the output_settings attribute:
- save
Save the scanned images to local hard disk drives or mapped network locations. Required attribute: save_path. More details are available in save, save-thumbnail.
- upload
Upload the scanned images to a URL. Required attribute: upload_target.
- return-base64
Returns the data of the scanned images in base64 format. Required attribute: none.
- return-handle
For C/C++ only: returns the handles to the scanned images. Required attribute: none.
- *-thumbnail
save-thumbnail, upload-thumbnail and return-base64-thumbnail perform the same operations on the thumbnails (scaled-down versions). thumbnail_height is optional, with a default value of 200 (i.e., the thumbnail is 200 pixels high).
You are free to combine any number of the output settings. For example, the request below uploads the scanned images in PDF format to a remote web server and returns the base64-encoded data of the thumbnails (JPG format):
{
  "output_settings" : [ {
    "type" : "upload",
    "format" : "pdf",
    "upload_target" : {
      "url" : "http://asprise.com/scan/applet/upload.php?action=dump"
    }
  }, {
    "type" : "return-base64-thumbnail",
    "format" : "jpg"
  } ]
}
Image Formats Supported¶
Use the format attribute to specify the desired output format. The following image formats are supported:
- jpg
JPEG format. This is the default format, i.e., jpg will be used if the format attribute is not present. Use jpeg_quality to specify the JPEG quality. The default value is 80.
- bmp
Bitmap format.
- png
Portable Network Graphics.
- tif
Tagged Image File Format. Use tiff_compression to specify the compression to be used for TIFF. Valid values are: G4, G3, LZW, RLE, ZIP and NONE (default).
- pdf
Portable Document Format. The following attributes may be used to customize PDF output:
pdf_force_black_white: Default is false; set to true to use CCITT Group 4 compression for ultra-small file size.
pdfa_compliant: Default is false; set to true to force PDF/A-1 compliance.
pdf_text_line: Optionally, you can specify a line of text to be printed on the bottom left of the first page. Macros can be used, e.g., “Scanned by ${COMPUTERNAME}/${USERNAME} on ${DATETIME}”.
pdf_owner_password, pdf_user_password: Optionally, you may specify passwords when pdfa_compliant is false.
exif: The following tags can be specified: DocumentName, UserComment and Copyright.
save, save-thumbnail¶
Use save and save-thumbnail to save the scanned images or thumbnails to local hard disk drives or mapped network locations.
save_path is a required attribute that specifies the target output location. Note that save_path specifies the complete target file location, not the containing folder.
If you set "save_path": ".\\test.jpg" and multiple images are scanned, you’ll get only the last one, as previous images will be overwritten. To avoid this problem, you may use any of the macros listed below:
- ${TMS}
Timestamp with milliseconds, e.g., ‘2020-12-25_08-06-27.880’.
- ${TM}
Timestamp without milliseconds, e.g., ‘2020-12-25_08-06-27’.
- ${DATE}
Readable timestamp in a format similar to ISO 8601.
- ${I}
Image index in a scanning session.
- ${EXT}
Image file extension according to the image format specified using format (default is JPG).
- ${TMP}, ${USERPROFILE} and all other environment variables
Environment variables are supported. You may define custom environment variables and use them in the path as long as you enclose the names in ${ and }.
Unsupported macros will be replaced with _.
The actual path to which the image is written will be returned in the result.
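To preview what a save_path template might resolve to, the macro substitution can be approximated as below. This is a sketch under stated assumptions rather than the library's own expansion logic: SavePathMacroSketch and expand are hypothetical names, the sample values for TMS, I and EXT are made up, and EXT is assumed to include the leading dot because the examples in this document write ${TMS}${EXT} without one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative macro expansion for save_path templates such as "${TMP}\\${TMS}${EXT}".
class SavePathMacroSketch {
    private static final Pattern MACRO = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${NAME} with its value; unknown names become "_". */
    static String expand(String template, Map<String, String> values) {
        Matcher m = MACRO.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), "_");
            // quoteReplacement keeps backslashes and dollar signs in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>(System.getenv()); // environment variables
        values.put("TMS", "2020-12-25_08-06-27.880"); // sample timestamp
        values.put("I", "0");                         // sample image index
        values.put("EXT", ".pdf");                    // assumed to include the dot
        System.out.println(expand("${TMP}\\${TMS}_${I}${EXT}", values));
    }
}
```

Including ${I} or ${TMS} in the template is what keeps the images of a multi-page session from overwriting one another.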
upload, upload-thumbnail¶
Use upload and upload-thumbnail to upload the scanned images or thumbnails to a remote server. The back end can be implemented in any programming language, such as C, C#/VB.NET, Java, Node.js, PHP, Python or Ruby on Rails, because image data are sent using standard HTTP POST.
To specify the upload destination, set upload_target to an object with the following attributes:
- url
The only required attribute; specifies the target URL.
- max_retries
Maximum number of retries before giving up; the default value is 2.
- post_file_field_name
The name of the POST field for image files. The default value is ‘asprise_scans’ if there is only one file or ‘asprise_scans[]’ if there are many files. You don’t need to add ‘[]’ yourself, as the system adds it automatically.
- post_fields
Optionally specifies additional fields to be POSTed.
- cookies
Optional cookies to be sent along with the request.
- auth
Optional authentication information.
- headers
Optionally adds an array of additional HTTP request headers.
- log_file
Optionally specifies a local file to which logging information is written for debugging purposes.
- max_operation_time
Maximum HTTP operation time allowed, in seconds. The default value is 1200 (20 minutes).
return-base64, return-base64-thumbnail¶
Use return-base64 and return-base64-thumbnail to return the data of the scanned images and thumbnails in base64 format.
Once you have obtained the base64-encoded data from the result, you can decode it into binary and read the image from it.
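The decoding step can be done with the standard Java library alone. In the sketch below, Base64ImageSketch and decodeImage are hypothetical names, and the base64 string is produced locally as a stand-in for the value you would take from the scan result. ImageIO reads raster formats such as JPG, PNG and BMP; data returned in PDF format should be written to a file instead.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Illustrative decoding of base64 image data into a BufferedImage.
class Base64ImageSketch {

    /** Decodes base64 text (line breaks are tolerated) and reads it as an image. */
    static BufferedImage decodeImage(String base64) throws IOException {
        byte[] bytes = Base64.getMimeDecoder().decode(base64);
        // ImageIO.read returns null when no registered reader understands the data
        return ImageIO.read(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for data from a scan result: encode a tiny blank PNG.
        BufferedImage source = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ImageIO.write(source, "png", buffer);
        String base64 = Base64.getEncoder().encodeToString(buffer.toByteArray());

        BufferedImage image = decodeImage(base64);
        System.out.println(image.getWidth() + " x " + image.getHeight());
    }
}
```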