Development of the ocr part of AOI
Samo Penic
2018-11-28 7621b38ff23a963724adcafe8946acce48e48abe
[Binary file: compiled Python bytecode of /home/samo/programiranje/python/sizif-ocr/aoi_ocr/Ocr.py. The contents cannot be displayed as text.]

Names recoverable from the bytecode:

Imports: pyzbar.pyzbar (decode), sid_process (getSID), cv2, numpy (as np), os, pkg_resources
Module-level names: markerfile ("/template-sq.png"), markerfilename
Class: Paper
Methods of Paper:
  __init__
  loadImage
  saveImage
  runOcr
  decodeQRandRotate
  rotateAngle
  imgTreshold
  getSkewAngle
  locateUpMarkers
  locateRightMarkers
  generateAnswerMatrix
  get_enhanced_sid
  get_code_data
  get_paper_ocr_data