
Ingredient

PlantVillage dataset

Also known as: PlantVillage, PV dataset

Open-access labeled dataset of ~54,000 leaf images covering 14 crop species and 38 classes (each class is one crop paired with either a specific disease or “healthy”). It is the de facto training corpus for crop-disease computer-vision models. Released by Penn State and EPFL (2016, CC0); downloadable from Kaggle and Hugging Face. It is the right ingredient when a project needs to train or fine-tune a plant-disease classifier without collecting and labeling images from scratch. Main limitation: the images are laboratory-style (a single leaf on a uniform background), so models trained only on PlantVillage often fail in field conditions and need fine-tuning on field images. Despite this, it remains the foundational starting dataset for the domain.

What it is

  • ~54,000 images across 14 crop species (apple, [[blueberry]], cherry, corn, grape, orange, peach, pepper, potato, [[raspberry]], soybean, squash, strawberry, tomato)
  • 38 classes: each pairs a crop with either “healthy” or a specific disease (e.g., “Tomato___Late_blight”, “Apple___Cedar_apple_rust”)
  • Format: JPEG images, ~256×256 px each, organized in folders per class
  • License: CC0 (public domain)
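Because the images sit in one folder per class and the folder names follow the “Crop___Condition” pattern shown above, the dataset can be indexed with the standard library alone. The following is a minimal sketch under that assumption; the function names and the `root` path are hypothetical, and some mirrors may use a different folder layout or file extension.

```python
from collections import Counter
from pathlib import Path


def parse_class_name(folder_name: str) -> tuple[str, str]:
    """Split 'Tomato___Late_blight' into ('Tomato', 'Late_blight')."""
    crop, sep, condition = folder_name.partition("___")
    if not sep:
        raise ValueError(f"unexpected class folder name: {folder_name!r}")
    return crop, condition


def index_dataset(root: str) -> list[tuple[Path, str, str]]:
    """Return (image_path, crop, condition) for every JPEG under root/<class>/."""
    records = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        crop, condition = parse_class_name(class_dir.name)
        for image_path in sorted(class_dir.iterdir()):
            # Assumption: images are JPEGs; other files are ignored.
            if image_path.suffix.lower() in {".jpg", ".jpeg"}:
                records.append((image_path, crop, condition))
    return records


def images_per_class(records) -> Counter:
    """Count images per original class label, e.g. 'Tomato___Late_blight'."""
    return Counter(f"{crop}___{condition}" for _, crop, condition in records)
```

Calling `images_per_class(index_dataset(root))` on a local copy gives a quick check that all 38 class folders were downloaded and shows how many images each one holds.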

Solves / unlocks

  • Training disease-classifier CNNs (ResNet, EfficientNet, MobileNet) for these 14 crops
  • Benchmark dataset for new architectures
  • Transfer-learning starting point for custom field datasets
  • Educational ML projects in agricultural computer vision
  • Mobile / edge-device disease screening apps
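For the benchmarking and transfer-learning uses listed above, results are easier to compare when the train/validation split is fixed and keeps every class in both sets. The sketch below shows one way to do that with a seeded, per-class (stratified) split. The function name, the 20% validation fraction, and the seed are illustrative assumptions, not part of the dataset or of any published benchmark protocol.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices), splitting each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        if len(indices) > 1:
            # Keep at least one validation image for every class.
            n_val = max(1, round(len(indices) * val_fraction))
        else:
            n_val = 0  # a single image stays in the training set
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)
```

The returned index lists refer to positions in `labels`, so they can be used to select the matching image paths from whatever list the labels were built from.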

Constraints

  • Lab backgrounds: most images show a single leaf on a neutral, uniform background, whereas field images include soil, multiple leaves, varied lighting, dew, and occlusion. Models that reach 99% accuracy on PlantVillage test images often drop to 50–70% on field images.
  • Class imbalance: some classes have ~5,000 images, others fewer than 500.
  • Limited geography: collected primarily in the U.S. and East Africa; it does not cover all regional disease variants.
  • Single-leaf assumption: field photos usually contain many leaves, so field use typically needs a detect-then-classify pipeline (e.g., a YOLO-style detector that crops individual leaves before classification).
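One common way to soften the class imbalance noted above is to weight each class inversely to its frequency during training. The sketch below computes such weights from per-class image counts. The function name is hypothetical, and any counts passed in should come from the actual download rather than from the rough figures quoted on this page.

```python
def inverse_frequency_weights(class_counts: dict[str, int]) -> dict[str, float]:
    """Weight each class by total / (num_classes * count).

    A class with exactly the mean number of images gets weight 1.0;
    rarer classes get larger weights, more common ones smaller weights.
    """
    if any(count <= 0 for count in class_counts.values()):
        raise ValueError("every class needs at least one image")
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {
        name: total / (num_classes * count)
        for name, count in class_counts.items()
    }
```

Weights like these are typically passed to a weighted loss, for example the `weight` argument of PyTorch's `torch.nn.CrossEntropyLoss`, ordered to match the model's class indices. Oversampling the rare classes is an alternative with a similar effect.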


See also


  • Member of: [[ingredient]]
  • Combines with: [[opencv]] · [[nvidia-jetson]] · [[raspberry-pi]] · [[yolo]]
