18 Application Example: Photo OCR

Problem Description and Pipeline

Text detection
Character segmentation
Character classification

Sliding Windows

Collect large training sets of positive and negative examples and train a classifier on them. Then take a fixed-size window (the green rectangle), slide it over a bit, and run the new image patch through the classifier to decide whether there's a pedestrian there. Repeat until the window has covered the whole image.
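
The sliding-window procedure above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the image is assumed to be a grayscale grid stored as nested lists, and `sliding_windows`, `detect`, and the stand-in classifier `bright_patch` are hypothetical names introduced here.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


def sliding_windows(
    height: int, width: int, win_h: int, win_w: int, step: int
) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every window that fits inside the image."""
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            yield top, left


def detect(
    image: Image,
    classify: Callable[[List[List[float]]], bool],
    win_h: int,
    win_w: int,
    step: int,
) -> List[Tuple[int, int]]:
    """Return the corners of all windows that the classifier marks as positive."""
    hits = []
    for top, left in sliding_windows(len(image), len(image[0]), win_h, win_w, step):
        # Cut out the patch under the current window position.
        patch = [list(row[left:left + win_w]) for row in image[top:top + win_h]]
        if classify(patch):
            hits.append((top, left))
    return hits


# Toy usage: a 4x6 "image" with one bright 2x2 blob, and a stand-in
# classifier (an assumption, not a trained model) that fires when the
# mean brightness of the patch exceeds 0.9.
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def bright_patch(patch: List[List[float]]) -> bool:
    flat = [p for row in patch for p in row]
    return sum(flat) / len(flat) > 0.9


hits = detect(toy_image, bright_patch, win_h=2, win_w=2, step=1)
```

A smaller `step` checks more positions and costs more classifier calls; a larger one is cheaper but can miss objects that fall between positions.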

Getting Lots of Data and Artificial Data

artificial data synthesis
collect the data and label it yourself
crowdsourcing
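
As one illustration of artificial data synthesis, the sketch below amplifies a single labeled example into several distorted copies that keep the same label. The choice of distortions (a small horizontal shift and a brightness change) and the names `shift`, `scale_brightness`, and `synthesize` are assumptions made for this example; in a real system the distortions should resemble the variation expected in the actual data.

```python
import random
from typing import List

Patch = List[List[float]]  # grayscale patch with pixel values in [0, 1]


def shift(patch: Patch, dx: int, fill: float = 0.0) -> Patch:
    """Shift a patch horizontally by dx pixels, padding the gap with `fill`."""
    width = len(patch[0])
    dx = max(-width, min(width, dx))  # keep the shift within the patch width
    out = []
    for row in patch:
        if dx >= 0:
            out.append([fill] * dx + row[: width - dx])
        else:
            out.append(row[-dx:] + [fill] * (-dx))
    return out


def scale_brightness(patch: Patch, factor: float) -> Patch:
    """Multiply every pixel by `factor`, clipping the result to [0, 1]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in patch]


def synthesize(
    patch: Patch, n: int, rng: random.Random, max_shift: int = 1
) -> List[Patch]:
    """Create n distorted copies of one example; each keeps the original label."""
    examples = []
    for _ in range(n):
        dx = rng.randint(-max_shift, max_shift)
        factor = rng.uniform(0.8, 1.2)
        examples.append(scale_brightness(shift(patch, dx), factor))
    return examples


# Toy usage: a 3x3 patch containing a vertical stroke.
letter = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
variants = synthesize(letter, n=5, rng=random.Random(0))
```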

Ceiling Analysis: What Part of the Pipeline to Work on Next

Where should you allocate resources? Which of the boxes in the pipeline is most worth your effort to improve?

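Choose one module, for example text detection.
Provide the correct (hand-labeled) outputs for that module, for example the correct text detection outputs.
Then use the same evaluation metric as before to measure the overall accuracy of the entire system.
Go back to the first step and choose the next module, keeping the correct outputs of the earlier modules in place.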
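
The bookkeeping for these steps can be sketched as follows. The sketch assumes the modules are replaced with correct outputs cumulatively, from the first pipeline stage to the last, and that the overall accuracy is re-measured after each replacement. The function name `ceiling_analysis` is hypothetical, and the accuracy figures are illustrative values only, not measured results.

```python
from typing import List, Tuple


def ceiling_analysis(
    baseline: float, stages: List[Tuple[str, float]]
) -> List[Tuple[str, float]]:
    """Return the accuracy gain attributable to each module.

    baseline: overall accuracy (in percent) of the unmodified system.
    stages:   (module name, overall accuracy in percent once this module and
              all earlier ones are given correct outputs), in pipeline order.
    """
    gains = []
    previous = baseline
    for name, accuracy in stages:
        # The gain is how much the overall accuracy rose by making this
        # module perfect, on top of the modules already made perfect.
        gains.append((name, accuracy - previous))
        previous = accuracy
    return gains


# Illustrative numbers only (percent accuracy of the whole system).
gains = ceiling_analysis(
    baseline=72.0,
    stages=[
        ("text detection", 89.0),
        ("character segmentation", 90.0),
        ("character classification", 100.0),
    ],
)

# The module with the largest gain has the most headroom for improvement.
best_module, best_gain = max(gains, key=lambda g: g[1])
```

With these illustrative figures, a perfect text detector would raise overall accuracy by 17 points, while a perfect character segmenter would add only 1, so effort spent on segmentation would have little payoff.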