Are Machine Learning Cloud APIs Used Correctly?

http://people.cs.uchicago.edu/~cwan/paper/ml_api.pdf

Machine learning provides effective solutions
Software development: problems --> bugs
ML cloud API
- Function as a service
- Help incorporate learning solutions into software systems
  - Require less domain knowledge
  - No need to design and train neural networks
ML APIs raise unique challenges
- Performing cognitive tasks: how people ask questions greatly affects the result
- Largely defined by training data: properties might not be known by API users
- Numeric vector output: high-dim, tricky to interpret
- Complicated accuracy-performance tradeoffs
Corpus
- Google / Amazon ML cloud API
- 3 ML domains: vision, language, speech
- 18 months, size of 2,200 lines
Anti-pattern identification methodology
- Manual examination
- Design test cases
- Report bugs
Result
- Most applications: misuses!
- Pattern
  - Calling the wrong API
    Subtle semantic differences among cognitive tasks
    e.g. image classification vs. object detection. Which one to use?
    e.g. text-detection vs. document-text-detection
    These misuses escape traditional testing
  - Misinterpreting outputs
    Numeric vector outputs are difficult to interpret
  - Misuse of async APIs
    complicated accuracy-performance tradeoffs
  - Unnecessarily high-resolution inputs
    higher resolution --> performance degrades
  - Many other misuses --> what types of impact (reduce functionality, degraded performance, increased cost)
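
To make the "misinterpreting outputs" pattern concrete, here is a minimal sketch based on sentiment analysis output of the kind Google's language API returns: a score in [-1, 1] plus a non-negative magnitude. Reading the score alone cannot tell neutral text apart from text with mixed strong feelings. The function name and both cutoff values are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def interpret_sentiment(score: float, magnitude: float,
                        score_cutoff: float = 0.25,
                        magnitude_cutoff: float = 1.0) -> str:
    """Map a (score, magnitude) pair to a coarse label.

    Both cutoffs are illustrative assumptions, not recommended values.
    """
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # A near-zero score is ambiguous on its own: a high magnitude suggests
    # strong positive and negative parts cancelling out, a low one suggests
    # genuinely neutral text.
    return "mixed" if magnitude >= magnitude_cutoff else "neutral"


print(interpret_sentiment(0.0, 4.0))
print(interpret_sentiment(0.05, 0.1))
```

The two calls have nearly the same score but land in different labels, which is the distinction a score-only check would miss.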
Design checkers
- Three static analysis tools for three misuses
- API wrappers for four misuses
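
An API wrapper for the high-resolution misuse could look roughly like the sketch below: it caps the input size before forwarding the image to the cloud call. This is not the paper's implementation. `capped_size`, `wrap_with_downscale`, the `api_call` and `resize` callables, and the 640-pixel default are all hypothetical names and values used for illustration.

```python
from typing import Any, Callable, Tuple


def capped_size(width: int, height: int, max_side: int = 640) -> Tuple[int, int]:
    """Return dimensions scaled so the longest side is at most max_side.

    The 640 default is an assumption, not a documented limit.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def wrap_with_downscale(api_call: Callable[[Any], Any],
                        resize: Callable[[Any, int, int], Any],
                        max_side: int = 640) -> Callable[[Any, int, int], Any]:
    """Wrap api_call so oversized images are resized before being sent."""
    def wrapped(image: Any, width: int, height: int) -> Any:
        new_w, new_h = capped_size(width, height, max_side)
        if (new_w, new_h) != (width, height):
            # Only pay the resize cost when the input is actually too large.
            image = resize(image, new_w, new_h)
        return api_call(image)
    return wrapped


print(capped_size(4000, 3000))
```

Passing `resize` and `api_call` in as parameters keeps the sketch independent of any particular imaging library or cloud client; a real wrapper would plug in the actual resize routine and ML API call.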

