Are Machine Learning Cloud APIs Used Correctly?
http://people.cs.uchicago.edu/~cwan/paper/ml_api.pdf
Presentation
Machine learning provides effective solutions
Software development: problems --> bugs
ML cloud API
Function as a service
Helps incorporate learning solutions into software systems
Require less domain knowledge
No need to design and train neural networks
ML APIs raise unique challenges
Performing cognitive tasks: how people ask questions greatly affects the result
Largely defined by training data: properties might not be known by API users
Numeric vector output: high-dimensional, tricky to interpret
Complicated accuracy-performance tradeoffs
Corpus
Google / Amazon ML cloud API
3 ML domains: vision, language, speech
18 months, size of 2,200 lines
Anti-pattern identification methodology
Manual examination
Design test cases
Report bugs
Result
Most applications: misuses!
Pattern
Calling the wrong API
Subtle semantics difference among cognitive tasks
e.g. image classification, object detection. Which one to use?
e.g. text-detection, document-text-detection
Escapes traditional testing
Misinterpreting outputs
Numeric vector outputs are difficult to interpret
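As one illustration of the output-interpretation problem: a sentiment API such as Google's Natural Language API returns a score in [-1, 1] together with a non-negative magnitude, and reading the score alone can mislabel emotionally mixed text as neutral. The sketch below is a minimal example under those assumptions. The function name `interpret_sentiment` and both thresholds are hypothetical and would need tuning per application.

```python
def interpret_sentiment(score: float, magnitude: float,
                        score_threshold: float = 0.25,
                        magnitude_threshold: float = 1.0) -> str:
    """Map a (score, magnitude) pair to a coarse label.

    Both thresholds are illustrative assumptions, not documented values.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score is expected to lie in [-1, 1]")
    if magnitude < 0:
        raise ValueError("magnitude is expected to be non-negative")
    if score >= score_threshold:
        return "positive"
    if score <= -score_threshold:
        return "negative"
    # Near-zero score: the magnitude separates "little emotion" from
    # "strong but mixed emotion", which the score alone cannot do.
    return "mixed" if magnitude >= magnitude_threshold else "neutral"
```

A caller that only checks `score` would treat the last two cases identically, which is the kind of misinterpretation described above.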
Misuse of async APIs
Complicated accuracy-performance tradeoffs
Unnecessarily high-resolution inputs
Higher resolution degrades performance
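An API wrapper can guard against oversized inputs by shrinking an image before it is sent. The sketch below only computes the target dimensions, so it stays independent of any imaging library. The helper name `downscaled_size` is hypothetical, and the 640x480 default is an assumption chosen for illustration; the provider's documentation should be checked for the size recommended per task.

```python
from typing import Tuple


def downscaled_size(width: int, height: int,
                    max_width: int = 640,
                    max_height: int = 480) -> Tuple[int, int]:
    """Return dimensions that fit within the limits, keeping aspect ratio.

    Images already within the limits are returned unchanged (never upscaled).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the tighter of the two constraints; cap at 1.0 to avoid upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The returned pair can be passed to whatever resize routine the application already uses before the image is uploaded.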
Many other misuses, grouped by impact: reduced functionality, degraded performance, increased cost
Design checkers
Three static analysis tools for three misuses
API wrappers for four misuses
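To give a rough sense of what a static checker for API misuse could look like, the sketch below walks a Python syntax tree and reports call sites of named methods. It is not the tooling from the paper, only a minimal illustration using the standard `ast` module. The function name `find_api_calls`, the sample source, and `make_client` are hypothetical.

```python
import ast
from typing import List, Set, Tuple


def find_api_calls(source: str, method_names: Set[str]) -> List[Tuple[int, str]]:
    """Return sorted (line number, method name) pairs for matching calls.

    Only attribute-style calls such as `client.some_method(...)` are matched.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            if node.func.attr in method_names:
                hits.append((node.lineno, node.func.attr))
    return sorted(hits)


# Hypothetical application code to scan; it is parsed, never executed.
sample = (
    "client = make_client()\n"
    "resp = client.text_detection(image=img)\n"
    "print(resp)\n"
)
```

A real checker would go further, for example by inspecting the arguments at each reported call site, but locating the calls is the common first step.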