Random Forest - Getting Started
Getting started with Random Forest!
What is Random Forest
A random forest classifier is an ensemble, tree-based learning algorithm. It consists of a set of decision trees, each built from a randomly selected subset of the training set, and it aggregates the votes of the individual trees to decide the final class of a test object [1].
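As the name suggests, a random forest builds a forest in a random way. The forest is made up of many decision trees, and the individual trees are independent of one another [2].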
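As a minimal sketch of the idea above, the following uses scikit-learn's `RandomForestClassifier` on synthetic data. The dataset, the number of trees, and the variable names are assumptions for illustration and do not come from the cited descriptions. It collects each tree's prediction and compares a plain majority vote with the forest's own output. Note that scikit-learn combines trees by averaging their predicted class probabilities rather than by counting hard votes, so the two can differ in general.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data stands in for a real training set (assumption).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# By default each tree is fit on a bootstrap sample of the training set.
forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X_train, y_train)

# The fitted trees are exposed as forest.estimators_. They predict encoded
# class indices, so map them back to labels through forest.classes_.
tree_idx = np.array([tree.predict(X_test) for tree in forest.estimators_])
tree_votes = forest.classes_[tree_idx.astype(int)]  # (n_trees, n_samples)

# Majority vote per test sample (labels here are the integers 0 and 1).
majority_vote = np.array([np.bincount(col).argmax() for col in tree_votes.T])

forest_pred = forest.predict(X_test)
agreement = np.mean(majority_vote == forest_pred)
print(f"trees: {len(forest.estimators_)}, agreement with majority vote: {agreement:.2f}")
```

Using an odd number of trees avoids tied votes in the two-class case.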
Open Source Geoprocessing Tutorial
Introduction
Tutorial of basic remote sensing and GIS methodologies using the open source Python geospatial ecosystem [4].
Tutorial of basic remote sensing and GIS methodologies using modern open source software in Python (rasterio, shapely, geopandas, folium, etc.). Notebooks cover raster processing, vector analysis, cloud-based tools like Google Earth Engine, a workflow to perform image classification using machine learning classifiers in scikit-learn, and an introduction to handling large array datasets with xarray.
The project is stored locally at C:\Users\username\Documents\VS_workspace\open-geo-tutorial
It is also available on Binder.
A Modern Geospatial Workflow in Python
What kind of python libraries make up a modern geospatial workflow? This is just a sample of what we consider to be the basic building blocks and what will be covered here:
- shapely for geometric analysis
- fiona for reading in vector formats
- rasterio for reading in and working with raster formats
- GeoPandas to extend pandas to work with geo formats
- numpy and the python scientific computing stack for efficient computation
- matplotlib for general plotting and visualization
- folium for advanced and interactive plotting
- scikit-learn for machine learning based data exploration, classification, and regression
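To give a feel for the first building block, here is a small sketch of geometric analysis with shapely. The coordinates are made up for illustration and are not taken from the tutorial's data.

```python
from shapely.geometry import Point, Polygon

# Two overlapping 2 x 2 squares (hypothetical coordinates).
square_a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
square_b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# Geometric overlay: the part shared by both squares.
overlap = square_a.intersection(square_b)
print("overlap area:", overlap.area)

# Point-in-polygon tests.
inside = square_a.contains(Point(1, 1))
outside = square_a.contains(Point(5, 5))
print("contains (1, 1):", inside)
print("contains (5, 5):", outside)
```

The same geometry objects are what GeoPandas stores in its geometry column, so operations like these carry over to whole tables of features.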
Why Python?
- Python is an actual programming language with a large standard library
  - Tools for file manipulation, command line argument parsing, and web access and parsing already exist and are robust
  - Python is already widely used in other scientific communities and outside of science for web servers, desktop applications, and server management
- Scientific Python provides very well documented and easy to use interfaces to pre-existing numeric methods
  - Linear algebra, classification routines, regression methods, and more have been published for decades as Fortran or C codes
  - Libraries such as SciPy, NumPy, and SciKits wrap and extend these pre-existing codes in an easy to use package
  - Strong machine learning and deep learning packages like keras and scikit-learn
- Large community with innumerable examples on blogs, StackOverflow, GitHub, etc.
- Develop Python plugins for QGIS
- Script analyses in QGIS or ArcMap
- Open-source!!!
Python Like You Mean It (PLYMI)
Python Like You Mean It (PLYMI) is a free resource for learning the basics of Python & NumPy and, beyond that, for becoming a competent Python user. The features of the Python language that are emphasized here were chosen to help those who are particularly interested in STEM applications (data analysis, machine learning, numerical work, etc.) [5].
Random Forest in Python
RF1
Here we use the Scikit-learn package to perform land-use classification, taking the RF (random forest) classification method as an example [3].
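The general shape of such a pixel-wise classification can be sketched as follows. This is only an illustration under stated assumptions: the image and the training labels are random synthetic arrays, the band count, class codes, and variable names are hypothetical, and a real workflow would read the raster and the training areas from files instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical 4-band image with shape (bands, rows, cols).
image = rng.random((4, 50, 60))

# Hypothetical training raster: 0 = unlabeled, 1..3 = land-use classes.
labels = np.zeros((50, 60), dtype=int)
labels[:10, :10] = 1
labels[20:30, 20:30] = 2
labels[40:, 50:] = 3

# Rearrange to one row per pixel and one column per band.
bands, rows, cols = image.shape
pixels = image.reshape(bands, -1).T  # (rows * cols, bands)
flat_labels = labels.ravel()         # same pixel order as `pixels`

# Train only on the labeled pixels.
mask = flat_labels > 0
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(pixels[mask], flat_labels[mask])

# Predict a class for every pixel and fold the result back into map shape.
class_map = clf.predict(pixels).reshape(rows, cols)
print(class_map.shape, np.unique(class_map))
```

Because the synthetic bands are pure noise, the predicted map carries no meaning here; the point is the reshape, fit, predict, reshape pattern.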
Dataset Preparation
This dataset was downloaded from the link (Code: v616)
Python environment
Packages that we need in miniconda:
- scipy
- scikit-learn
- matplotlib (optional)
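One way to confirm that the packages listed above are visible to the active environment is a quick check like the one below. The helper name `check_packages` is hypothetical. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Map install names to import names (scikit-learn is imported as "sklearn").
REQUIRED = {"scipy": "scipy", "scikit-learn": "sklearn"}
OPTIONAL = {"matplotlib": "matplotlib"}


def check_packages(packages):
    """Return {install name: bool} telling whether each module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in packages.items()
    }


for group, packages in (("required", REQUIRED), ("optional", OPTIONAL)):
    for name, found in check_packages(packages).items():
        status = "found" if found else "missing"
        print(f"{group:8s} {name:13s} {status}")
```

Any package reported as missing can then be installed into the miniconda environment before continuing.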
Project
- Create a new project in Gitlab and clone it to VS Code
- Add a model.py file
- Change the python environment to 3.9.12 (base:conda) in the bottom right of VS Code