Text Recognition API

API Description

The Text Recognition API recognizes text in an image and can also return the position of the recognized text.

Request Description

URL

https://openapi.ocr.sys303.com/api/v1/ocr/general?access_token={token}

Parameters

URL Parameters

| Parameter | Value |
| --- | --- |
| access_token | The access token obtained using the API Key and Secret Key. |

Header Parameters

| Parameter | Value |
| --- | --- |
| Content-Type | application/x-www-form-urlencoded |

Body Parameters

| Parameter | Required | Type | Possible Values | Description |
| --- | --- | --- | --- | --- |
| image | One of four options | string | - | Image data, base64 encoded and then URL encoded. Size must not exceed 10 MB; shortest side at least 15 px, longest side at most 8192 px. Supported formats: jpg, jpeg, png, bmp. Priority: image > url > pdf_file > ofd_file. |
| language_type | No | string | 0 | Recognition language type. Default is Tibetan [0: Tibetan, 1: Chinese]. |
| url | One of four options | string | [Under development] | Full URL of the image. The URL must not exceed 1024 bytes; the image fetched from the URL, once base64 encoded, must not exceed 10 MB; shortest side at least 15 px, longest side at most 8192 px. Priority: image > url > pdf_file > ofd_file. |
| pdf_file | One of four options | string | [Under development] | PDF file, base64 encoded and then URL encoded. Size must not exceed 10 MB; shortest side at least 15 px, longest side at most 8192 px. Priority: image > url > pdf_file > ofd_file. |
| ofd_file | One of four options | string | [Under development] | OFD file, base64 encoded and then URL encoded. Size must not exceed 10 MB; shortest side at least 15 px, longest side at most 8192 px. Priority: image > url > pdf_file > ofd_file. |
| pdf_file_num | No | string | [Under development] | Page number of the PDF file to recognize. Used when the pdf_file parameter is valid. If not provided, the first page is recognized. |
| ofd_file_num | No | string | [Under development] | Page number of the OFD file to recognize. Used when the ofd_file parameter is valid. If not provided, the first page is recognized. |
| recognize_granularity | No | string | [Under development] | Whether to locate individual characters. Options: big (do not locate individual characters; default), small (locate individual characters). |
| detect_direction | No | string | [Under development] | Whether to detect the orientation of the image. Options: true (detect orientation), false (do not detect orientation; default). Orientation is either normal or rotated 90, 180, or 270 degrees counterclockwise. If the input image is not upright, setting this parameter to "true" is recommended for better recognition results. |
| vertexes_location | No | string | [Under development] | Whether to return the vertex positions of the outer polygon around the text. Single-character positions are not supported. Default is false. |
| paragraph | No | string | [Under development] | Whether to output paragraph information. |
| probability | No | string | [Under development] | Whether to return the confidence level for each line in the recognition result. |
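
The encoding requirements for the image parameter can be sketched in Python as follows. This is a minimal illustration rather than an official client: the helper name build_body is hypothetical, and the sketch assumes that the 10 MB limit applies to the base64-encoded text and means 10 × 1024 × 1024 bytes, which the table does not state explicitly. The side-length and format limits are not checked here.

```python
import base64
from urllib.parse import urlencode

# Assumption: "10MB" is read as 10 MiB of base64 text; the table does not say.
MAX_ENCODED_BYTES = 10 * 1024 * 1024


def build_body(image_bytes: bytes, language_type: int = 0) -> bytes:
    """Return an application/x-www-form-urlencoded body (hypothetical helper)."""
    if language_type not in (0, 1):
        raise ValueError("language_type must be 0 (Tibetan) or 1 (Chinese)")
    # Step 1: base64 encode the raw image bytes.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) > MAX_ENCODED_BYTES:
        raise ValueError("base64-encoded image exceeds 10 MB")
    # Step 2: URL encode. urlencode percent-escapes '+', '/' and '=',
    # all of which can appear in base64 output.
    form = {"image": encoded, "language_type": language_type}
    return urlencode(form).encode("ascii")
```

Calling build_body(open("page.jpg", "rb").read()) would give bytes suitable for the request body, where page.jpg stands for any local image file.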

Request Example

curl --request POST \
  --url 'https://openapi.ocr.sys303.com/api/v1/ocr/general?access_token=<access_token>' \
  --header 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'image=<base64-encoded image data>' \
  --data-urlencode 'language_type=0'
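
For callers working in Python rather than curl, the same request might be assembled with the standard library as sketched below. The function name make_request is hypothetical, and body is assumed to be an already form-encoded byte string (base64 encoded, then URL encoded). The sketch only builds the request object; sending it is left as a comment so the example runs without network access.

```python
import urllib.request
from urllib.parse import urlencode

ENDPOINT = "https://openapi.ocr.sys303.com/api/v1/ocr/general"


def make_request(access_token: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request for the OCR endpoint."""
    # access_token travels in the query string, as in the URL section.
    url = f"{ENDPOINT}?{urlencode({'access_token': access_token})}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# To send it (requires a valid token and network access):
# with urllib.request.urlopen(make_request(token, body), timeout=30) as resp:
#     raw = resp.read().decode("utf-8")
```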

Response Description

Parameter Description

| Field | Required | Type | Status | Description |
| --- | --- | --- | --- | --- |
| log_id | Yes | uint64 | - | Unique log ID, used for troubleshooting. |
| words_result_num | Yes | uint32 | - | Number of recognition results, that is, the number of elements in words_result. |
| paragraphs_result_num | Yes | uint32 | | Number of paragraph results, that is, the number of elements in paragraphs_result. |
| words_result | Yes | array[] | - | Array of recognition results. |
| + words | No | string | - | Recognized text string. |
| + location | No | string | [In Development] | Position of the text string. |
| ++ top | No | string | [In Development] | Vertical coordinate of the top-left corner of the bounding rectangle. |
| ++ left | No | string | [In Development] | Horizontal coordinate of the top-left corner of the bounding rectangle. |
| ++ width | No | string | [In Development] | Width of the bounding rectangle. |
| ++ height | No | string | [In Development] | Height of the bounding rectangle. |
| + probability | No | object | [In Development] | Confidence values for each recognized line: average (mean line confidence), variance (line confidence variance), min (minimum line confidence). Returned when probability=true. |
| paragraphs_result | No | array[] | [In Development] | Paragraph detection results, returned when paragraph=true. |
| + words_result_idx | No | array[] | [In Development] | Line numbers included in a paragraph, returned when paragraph=true. |
| pdf_file_size | No | string | [In Development] | Total page count of the input PDF file, returned when the pdf_file parameter is valid. |
| ofd_file_size | No | string | [In Development] | Total page count of the input OFD file, returned when the ofd_file parameter is valid. |
| direction | No | int32 | [In Development] | Image orientation, returned when detect_direction=true. -1: undefined; 0: normal; 1: 90° counterclockwise; 2: 180° counterclockwise; 3: 270° counterclockwise. |
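
As an illustration of how the location and direction fields might be consumed, the sketch below converts a location rectangle into corner coordinates and maps a direction code to a label. Both helper names are hypothetical. Because the table types the coordinates as strings while the sample response shows numbers, the sketch assumes either form may arrive and converts with int(). Both fields are marked as in development, so they may be absent.

```python
from typing import Optional, Tuple

# Labels follow the direction codes listed in the field table.
DIRECTION_LABELS = {
    -1: "undefined",
    0: "normal",
    1: "90° counterclockwise",
    2: "180° counterclockwise",
    3: "270° counterclockwise",
}


def bounding_box(item: dict) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) for one words_result entry, or None."""
    loc = item.get("location")
    if not loc:
        return None  # location is optional
    left, top = int(loc["left"]), int(loc["top"])
    return (left, top, left + int(loc["width"]), top + int(loc["height"]))


def direction_label(result: dict) -> Optional[str]:
    """Return a readable orientation, or None when direction is absent."""
    if "direction" not in result:
        return None
    return DIRECTION_LABELS.get(int(result["direction"]), "unknown code")
```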

Response Example

Success
{
  "logId": 0,
  "words_result_num": 2,
  "words_result": [
    {
      "words": "Tibet OCR",
      "location": {
        "left": 0,
        "top": 0,
        "width": 0,
        "height": 0
      }
    },
    {
      "words": "Changyuan Shengbang Technology Co., Ltd.",
      "location": {
        "left": 0,
        "top": 0,
        "width": 0,
        "height": 0
      }
    }
  ]
}
Failure
{
  "error_code": 110,
  "error_msg": "Access token invalid or no longer valid"
}
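
A client has to tell the two response shapes apart. One possible approach, sketched below, treats the presence of error_code as the failure signal and otherwise collects the recognized lines. The names OcrApiError and parse_response are hypothetical. Since the field table names the log identifier log_id while the sample response shows logId, the sketch accepts either key rather than assuming which one the service returns.

```python
import json


class OcrApiError(Exception):
    """Raised when the response carries error_code instead of results."""

    def __init__(self, code, message):
        super().__init__(f"OCR API error {code}: {message}")
        self.code = code
        self.message = message


def parse_response(raw: str) -> dict:
    """Return {'log_id': ..., 'lines': [...]} or raise OcrApiError."""
    payload = json.loads(raw)
    if "error_code" in payload:
        raise OcrApiError(payload["error_code"], payload.get("error_msg", ""))
    # Accept both spellings of the log identifier seen in this document.
    log_id = payload.get("log_id", payload.get("logId"))
    # words is optional per the field table, so fall back to an empty string.
    lines = [item.get("words", "") for item in payload.get("words_result", [])]
    return {"log_id": log_id, "lines": lines}
```

With the failure example above, parse_response raises OcrApiError with code 110, which a caller could catch to refresh the access token and retry.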