Random Forest - Getting Started

Getting started with Random Forest!

What is Random Forest

A random forest classifier is an ensemble, tree-based learning algorithm. It builds a set of decision trees, each trained on a randomly selected subset of the training set, and aggregates the votes of the individual trees to decide the final class of a test object [1].

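As the name suggests, a random forest builds a forest in a random way. The forest is made up of many decision trees, and the individual decision trees are independent of one another [2].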
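To make the voting idea concrete, here is a minimal sketch using scikit-learn's RandomForestClassifier on synthetic data. The dataset, the tree count of 25, and all variable names are assumptions made for illustration. Note that scikit-learn averages the trees' predicted class probabilities rather than counting hard votes, so the hand-counted majority vote below is a simplified view that usually, but not necessarily always, matches forest.predict.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for a real training set (assumption).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# With bootstrap=True, each tree is fit on a random resample of the training set.
forest = RandomForestClassifier(n_estimators=25, bootstrap=True, random_state=0)
forest.fit(X_train, y_train)

# Collect one prediction per tree and per test sample: shape (n_trees, n_samples).
tree_votes = np.stack([tree.predict(X_test) for tree in forest.estimators_]).astype(int)

# Hand-counted majority vote across the trees for each test sample.
majority_vote = np.array([np.bincount(votes, minlength=2).argmax() for votes in tree_votes.T])

# Compare with the library's own aggregated prediction.
forest_pred = forest.predict(X_test)
agreement = np.mean(majority_vote == forest_pred)
print(f"Share of test samples where the manual vote matches forest.predict: {agreement:.2f}")
```

The printed share shows how closely the simple "count the votes" picture tracks what the library actually returns.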

Open Source Geoprocessing Tutorial

Introduction

A tutorial on basic remote sensing and GIS methodologies using the open source Python geospatial ecosystem (rasterio, shapely, geopandas, folium, etc.) [4].

The notebooks cover raster processing, vector analysis, cloud-based tools like Google Earth Engine, a workflow for image classification using machine learning classifiers in scikit-learn, and an introduction to handling large array datasets with xarray.

The project is located locally at C:\Users\username\Documents\VS_workspace\open-geo-tutorial

It is also available on Binder.

A Modern Geospatial Workflow in Python

What kind of Python libraries make up a modern geospatial workflow? The following is a sample of what we consider the basic building blocks, all of which are covered here:

  • shapely for geometric analysis
  • fiona for reading in vector formats
  • rasterio for reading in and working with raster formats
  • GeoPandas to extend pandas to work with geo formats
  • NumPy and the Python scientific computing stack for efficient computation
  • matplotlib for general plotting and visualization
  • folium for advanced and interactive plotting
  • scikit-learn for machine learning based data exploration, classification, and regression

Why Python?

  • Python is an actual programming language with a large standard library
    • Tools for file manipulation, command-line argument parsing, and web access and parsing already exist and are robust
    • Python is already widely used in other scientific communities and, outside of science, for web servers, desktop applications, and server management
  • Scientific Python provides well-documented, easy-to-use interfaces to pre-existing numeric methods
    • Linear algebra, classification routines, regression methods, and more have been published for decades as Fortran or C code
    • Libraries such as SciPy, NumPy, and SciKits wrap and extend this pre-existing code in easy-to-use packages
    • Strong machine learning and deep learning packages such as Keras and scikit-learn
  • Large community with innumerable examples on blogs, Stack Overflow, GitHub, etc.
  • Develop Python plugins for QGIS
  • Script analyses in QGIS or ArcMap
  • Open source!

Python Like You Mean It (PLYMI)

Python Like You Mean It (PLYMI) is a free resource for learning the basics of Python and NumPy and, beyond that, for becoming a competent Python user. The features of the Python language emphasized there were chosen to help those who are particularly interested in STEM applications (data analysis, machine learning, numerical work, etc.) [5].

Random Forest in Python


Here we use the scikit-learn package to perform land use classification, taking the RF (random forest) classification method as an example [3].
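As a rough sketch of what such a land use classification can look like, the example below trains a random forest on labeled pixels and predicts a class for every pixel. Everything here is an assumption for illustration: the image is random noise standing in for a real multiband raster, the label raster and its class codes (0 for unlabeled, 1 to 3 for classes) are hypothetical, and the array layout (bands, rows, cols) simply mirrors what rasterio's read() returns. It is not the workflow of the cited source.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Hypothetical 4-band image of 50 x 60 pixels, laid out as (bands, rows, cols).
n_bands, n_rows, n_cols = 4, 50, 60
image = rng.random((n_bands, n_rows, n_cols))

# Hypothetical label raster: 0 = unlabeled, 1..3 = land use classes.
labels = np.zeros((n_rows, n_cols), dtype=np.uint8)
labels[:10, :10] = 1
labels[20:30, 20:30] = 2
labels[40:, 50:] = 3

# Move the band axis last and flatten: one row per pixel, one column per band.
pixels = np.moveaxis(image, 0, -1).reshape(-1, n_bands)
flat_labels = labels.ravel()

# Train only on pixels that carry a label.
train_mask = flat_labels > 0
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(pixels[train_mask], flat_labels[train_mask])

# Predict a class for every pixel and restore the raster shape.
class_map = clf.predict(pixels).reshape(n_rows, n_cols)
print(class_map.shape, np.unique(class_map))
```

With real data, the random image would be replaced by the bands read from the raster, and the label raster by rasterized training polygons.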

Dataset Preparation

This dataset was downloaded from the link (code: v616)

Python environment

Packages that we need in Miniconda:

  • scipy
  • scikit-learn
  • matplotlib (optional)

# Check installed packages
(base) PS C:\Users\mengy> pip list

# GDAL
$ conda install -c conda-forge gdal

# NumPy
$ conda install -c anaconda numpy

# SciPy
$ conda install -c anaconda scipy

# scikit-learn
$ conda install -c anaconda scikit-learn

# matplotlib
$ conda install -c conda-forge matplotlib

# rasterio
$ conda install -c conda-forge rasterio

# Fiona
$ conda install -c conda-forge fiona

# geopandas
$ conda install -c conda-forge geopandas
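After installing, it can be handy to confirm from inside Python which of these packages the active environment actually sees. The sketch below uses only the standard library's importlib.metadata. The package list and the helper name installed_version are assumptions for illustration, and the names are distribution names as pip reports them, which may differ from what conda shows for some packages.

```python
from importlib import metadata

# Distribution names as pip reports them (assumed; adjust to your environment).
packages = ["numpy", "scipy", "scikit-learn", "matplotlib",
            "rasterio", "fiona", "geopandas", "GDAL"]


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Look up every package once and print a small report.
versions = {name: installed_version(name) for name in packages}
for name, version in versions.items():
    print(f"{name:<14}{version or 'not installed'}")
```

Running this with the interpreter selected in VS Code shows whether that interpreter is the same environment the conda commands installed into.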

Project

  1. Create a new project in GitLab and clone it to VS Code

  2. Add a model.py file

  3. Change the Python environment to 3.9.12 (base: conda) in the bottom right of VS Code

Random Forest in R

Random Forest Results Evaluation

Confusion Matrix
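
A confusion matrix tabulates reference (true) classes against predicted classes, so the diagonal holds the correctly classified samples. The sketch below uses scikit-learn's confusion_matrix on a handful of hypothetical labels. The label values and variable names are assumptions for illustration, not results from the classification above. In scikit-learn's convention, rows are true classes and columns are predicted classes.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical reference and predicted class labels for ten validation samples.
y_true = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 3])
y_pred = np.array([1, 1, 2, 2, 2, 3, 3, 3, 3, 1])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3])

# Overall accuracy: share of samples on the diagonal.
overall_accuracy = accuracy_score(y_true, y_pred)

# Producer's accuracy (recall): correct samples divided by the row (reference) totals.
producers_accuracy = np.diag(cm) / cm.sum(axis=1)

# User's accuracy (precision): correct samples divided by the column (predicted) totals.
users_accuracy = np.diag(cm) / cm.sum(axis=0)

print(cm)
print(f"Overall accuracy: {overall_accuracy:.2f}")
```

For a real evaluation, y_true and y_pred would come from held-out validation pixels rather than the pixels used for training.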


Random Forest - Getting Started
https://mengyuchi.gitlab.io/2023/01/07/Classification-Random-Forest/
Author: Yuchi Meng
Posted on: January 7, 2023
Licensed under