
Introduction to spatial joins with QGIS

QGIS is a free and open source geographic information system (GIS) that’s extensible, interoperable with other GISes, and used by a ton of people (including me) who have geographic data to analyze and visualize. It’s a great platform with an enormous set of capabilities, which can seem daunting on first approach. If you’re interested in getting your feet wet in geographic data analysis and visualization, the following basic concepts will help you get started with QGIS.

Exploring the spatial join problem

Relational database users are familiar with the concept of a table join, which is a way of associating records in one table with records in another. For example, suppose I have one table that lists employees (“Employee”) and another that lists branches of the company (“Office”). I can add a value to the Employee table that indicates the office where the employee works:

The field OfficeId in the Employee table “points to” the correct row in the Office table. In the database world, OfficeId is said to be a primary key in the Office table and a foreign key in the Employee table.
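To make the key relationship concrete, here is a minimal Python sketch of such a join. The rows, names, and the OfficeCity output field are made up for illustration; only the Employee and Office tables and the OfficeId field come from the example above.

```python
# Hypothetical sample rows; names and cities are illustrative only.
# The Office table is keyed by its primary key, OfficeId.
offices = {
    1: {"OfficeId": 1, "City": "Santiago"},
    2: {"OfficeId": 2, "City": "Concepción"},
}

# Each Employee row carries OfficeId as a foreign key.
employees = [
    {"Name": "Ana", "OfficeId": 1},
    {"Name": "Luis", "OfficeId": 2},
    {"Name": "Marta", "OfficeId": 1},
]


def join_employees_to_offices(employees, offices):
    """Attach the matching Office city to each Employee row via OfficeId."""
    joined = []
    for emp in employees:
        office = offices.get(emp["OfficeId"])  # foreign key -> primary key lookup
        if office is not None:  # inner join: drop employees with no matching office
            joined.append({**emp, "OfficeCity": office["City"]})
    return joined


for row in join_employees_to_offices(employees, offices):
    print(row["Name"], "works in", row["OfficeCity"])
```

A relational database would do the same thing with a JOIN clause; the dictionary lookup here simply stands in for that matching step.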

Since the office is at a location, there’s a spatial thing going on here: I know the address of the office, so I can find it on OpenStreetMap, for example. What if, instead of an OfficeId field, I had a field specifying the office location in both the Employee and Office tables? My relational database would then need to be extended with an operator that verifies that two locations are equal; for example, by checking that the Cartesian distance between the two points is less than some small amount. This kind of spatial relationship comes up in all sorts of interesting problems.
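A location-equality operator of that kind might look something like the following sketch. The function name, the default tolerance, and the coordinates are all assumptions chosen for illustration; a real system would pick a tolerance that suits its coordinate units.

```python
import math


def same_location(a, b, tolerance=1e-6):
    """Return True when points a and b, given as (x, y), are closer than `tolerance`.

    The tolerance is expressed in the same units as the coordinates.
    """
    return math.dist(a, b) < tolerance


# Hypothetical coordinates for an office as stored in two different tables.
office_location = (-73.05, -36.82)
employee_office_location = (-73.0500000001, -36.82)

print(same_location(office_location, employee_office_location))
```

Joining on this operator instead of on OfficeId is, in spirit, a spatial join: rows are matched by where they are rather than by a shared key.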

For instance…

I’m working on a writing project with some colleagues in Chile to explore issues related to hydroelectric power development there. One of the items we want to include in the project is a map showing the location of existing hydroelectric generation facilities. Along with that map, we want to summarize the information related to those facilities by watershed. A watershed is the area of land that drains into a river system and eventually to the ocean (or another body of water that may be landlocked). Watersheds are important for all sorts of land management reasons, as they tend to define ecosystems, climatic zones, and even traditional use areas. The figure below, made with QGIS, shows parts of two watersheds (the areas delimited by the thick blue lines, with their names, Río Itata and Río BíoBío, in blue italic boldface) and the power generation facilities (symbolized by blue diamonds):

In the same way that my relational example showed two tables of related data, this map shows two spatial datasets: watersheds and hydroelectric generation stations. The watersheds are represented as features that have spatial extent (or area) and location; the generation stations are represented as points, which have location only. Both datasets include attributes that help identify each feature defined in the dataset; for example, the name of the watershed or the amount of power the generating station produces.

Suppose this summarizing task requires determining how much power is generated in each watershed. One way to do this is to go through the generating station dataset and assign a value to each record that points to the watershed that contains the point. I can carry out this task manually because I can observe which points lie within which watersheds. This is pretty laborious. But, given that the two datasets already define the spatial nature of the watersheds and points, and given that QGIS can read this information and generate a map, can QGIS figure out this relationship for me?

Installing QGIS

Before I can use QGIS, I need to install it. The versions offered in various distros’ repositories can be quite out of date, to the point of being unable to load various useful plugins because of incompatibilities between the plugin dependencies and the libraries offered by the distribution (e.g., Qt libraries).

There are often newer alternatives to those in the repos. For example, both Fedora and Ubuntu offer GIS projects that incorporate all sorts of useful spatial analysis tools. Another alternative is to download it from the QGIS site (as I am writing, both the new long-term release, version 3.4.4, and the previous long-term release, 2.18.28, are available). In my experience, it’s better to select the new long-term release to avoid problems similar to those in the older versions in the repositories. However, at present much of the online content available for QGIS references the older QGIS 2 versions, and it may take some puzzling to determine how to accomplish things in QGIS 3.

I downloaded the latest long-term release from the QGIS website and started it up:

Getting the data

I’m using publicly available watersheds data from Chile’s Ministry of National Assets’ (Ministerio de Bienes Nacionales) IDEChile website on the page Cuencas Banco Nacional de Aguas (Watersheds National Water Bank). To get the data, I click on the button marked Descargar to download the data as a .zip file and extract the file (Cuencas_BNA.zip) into a folder.

The hydroelectric generating plant data is available from Chile’s Ministry of Energy (Ministerio de Energía) IDEEnergía website starting from this web mapping page. On the left-hand side of that page under the word Overlays, I click on the link “Centrales Generación Eléctrica” to expand the menu below, which offers several more links, including “Hidroeléctricas.” I right-click on that link to bring up a sub-menu that includes Export to SHP:

When I click on that link, it opens a form:

To activate the download, I need to fill in the top part of the form with my first name (Nombre), my surname (Apellidos), my email address (Mail), and my reason for downloading (Motivo); I chose “Investigación.” When I click on the Descargar button in the lower-right corner of the form, the data arrives as a .zip file (Hidroeléctricas.zip), which I extract into a folder.

Getting ready to do the spatial join

The first step in my analysis is to load the layers into my Layers window (lower-left in Fig. 3) by using the top menu to choose Layer > Add layer > Add vector layer, which opens the data source manager:

Before I go looking for the data, it’s worth explaining why I chose a vector layer and not one of the other options. Vector data, in GIS-speak, are spatial entities represented by points, lines, or polygons (along with a few other specialized data types). Other kinds of spatial data exist; most notably raster data, which is similar to images but contains other information such as the location of the raster on the Earth’s surface. This article gives a simple introduction to this terminology.

My hydroelectric generating plant data is modeled as points with attributes (such as power generated, plant name, etc.). It is stored in shapefile format, one of a (very large) number of vector formats in use today. Note that the data source manager above defaults to File format (correct for shapefiles). My encoding is set to Latin1, which (in my experience) is the most common character encoding for shapefiles. Clicking on the button next to Vector Dataset(s) opens a file browser that I can use to find my shapefile. Note that shapefiles are actually groups of files with different three-letter extensions: .shp for the shapefile geometry data, .dbf for the attributes, and so on. To add my layer, I select the XXXX.shp file; QGIS knows to associate the other files with this to create the full layer. Finally, once the source is identified in the data source manager, I can click the Add button at the bottom right; my layer is added to the Layers panel, and the points appear on the spatial view panel:
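Because a shapefile is really a group of files, it can be handy to check which pieces an extracted .zip actually contained. The sketch below is one way to do that with the Python standard library; the function name and the path in the usage line are hypothetical, and the extension list covers only the common companion files (a given dataset may not include all of them, and may include others).

```python
from pathlib import Path

# Common shapefile companion extensions: geometry, index, attributes,
# projection, and character-encoding hint.
SIDECAR_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj", ".cpg")


def shapefile_parts(shp_path):
    """Return the existing files that share the .shp file's base name."""
    shp = Path(shp_path)
    return [
        shp.with_suffix(ext)
        for ext in SIDECAR_EXTENSIONS
        if shp.with_suffix(ext).exists()
    ]


# Hypothetical path; nothing is printed if the files are not there.
for part in shapefile_parts("data/plants.shp"):
    print(part)
```

Note that the check is case-sensitive on most Linux file systems, so a file named with an uppercase extension would not be found by this sketch.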

Similarly, I can add the watersheds, which are polygons:

Unfortunately, the watershed polygons (in green above) cover the plant positions, making for a pretty ugly map. QGIS renders the layers in last-first order (the layer at the bottom of the list is drawn first), so to get the watersheds below the plant positions, I could drag the watershed layer down. I can also right-click on each layer, which gives me a variety of options to change the names shown in the layer screen, change the properties (e.g., symbolization, labeling), open the attribute tables, and so on.

I want to have something more pleasant to look at, so I:

  1. Add the OpenStreetMap layer (right-click > OpenStreetMap in the browser window);

  2. Rearrange the layers so that plants are on the top and OpenStreetMap is on the bottom (drag the layers to rearrange);

  3. Change the names shown on the layers (right-click > Rename);

  4. Change the symbology so the watershed polygons are transparent with a thick blue line and plants are blue diamonds (right-click > Properties > Symbology);

  5. Label the watersheds in blue italic text (right-click > Properties > Labels);

  6. Zoom in a bit (using the zoom tool on the toolbar).

Here’s the result:

OK, now that I can stand to look at the map, I’ll move on to the calculations.

Doing the spatial join

To carry out the spatial join, use the top menu’s Vector > Data management tools > Join attributes by location, which brings up the following dialog box (I filled in the values I want in the fields offered):

I selected the plants as the Input layer and the watersheds as the Join layer to put the watershed information onto the plant data. By selecting intersects as the Geometric predicate, I calculate the polygon with which each point intersects or, since we’re talking points and polygons, the polygon within which each point lies. I use a prefix of “ws” for the joined-on attributes and specify a resulting output layer (as a shapefile in this case):
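To give a feel for what a point-in-polygon join involves, here is a small pure-Python sketch. It is not how QGIS implements the tool; it uses a textbook ray-casting test on simple polygons without holes, and the geometries, the NOMBRE field, and the attribute values are all made up. Only the “ws” prefix and the NOM_CUEN and POTENCIA field names echo the data used in this article.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside polygon (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_by_location(points, polygons, prefix="ws"):
    """Copy the attributes of the first containing polygon onto each point."""
    joined = []
    for pt in points:
        row = dict(pt["attrs"])
        for poly in polygons:
            if point_in_polygon(pt["xy"], poly["ring"]):
                row.update({prefix + k: v for k, v in poly["attrs"].items()})
                break
        joined.append(row)  # unmatched points are kept without joined attributes
    return joined


# Hypothetical data: two square "watersheds" and three "plants".
watersheds = [
    {"ring": [(0, 0), (10, 0), (10, 10), (0, 10)], "attrs": {"NOM_CUEN": "Watershed A"}},
    {"ring": [(10, 0), (20, 0), (20, 10), (10, 10)], "attrs": {"NOM_CUEN": "Watershed B"}},
]
plants = [
    {"xy": (2, 3), "attrs": {"NOMBRE": "Plant 1", "POTENCIA": 10.0}},
    {"xy": (15, 5), "attrs": {"NOMBRE": "Plant 2", "POTENCIA": 20.0}},
    {"xy": (30, 5), "attrs": {"NOMBRE": "Plant 3", "POTENCIA": 5.0}},
]

joined = join_by_location(plants, watersheds)
for row in joined:
    print(row)
```

Keeping unmatched points in the output is a choice made for this sketch. Points that fall exactly on a polygon boundary are also not handled with any care here, which is one of many reasons to let a GIS do this work on real data.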

After I dismiss the dialog box, I can look at the attributes for the Joined layer by right-clicking on the layer and selecting Open attribute table, where I can see the joined-on attributes:

Finally, what about the summary of hydroelectric power by watershed? If I were using a spreadsheet program like LibreOffice Calc, I would use a pivot table to accomplish this task; in fact, that’s possible, since I can open the .dbf file for my Joined layer.

But QGIS has plenty of analysis tools. If I open the Processing Toolbox using the top menu’s Processing item and search for “group,” I can see in Vector Analysis the tool Statistics by categories:

If I double-click on that tool, I see:

I’ve selected the “POTENCIA” field (i.e., the hydroelectric potential in megawatts) for calculating statistics. I’ve defined the field “wsNOM_CUEN” as containing the categories I want to summarize against. Clicking OK on the category selector and then Run creates a new attribute table layer, Statistics by category, on which I can right-click and select Open attribute table:

For example, I can see that the Bueno River watershed (“Río Bueno”) has a sum of 185.516 MW of hydroelectric potential developed.
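The grouping itself is the same idea as a pivot table. As a rough illustration (again, not QGIS’s implementation), the following sketch computes a count, sum, and mean of one field per category. The rows are invented and are not taken from the Chilean datasets; only the POTENCIA and wsNOM_CUEN field names come from the joined layer described above.

```python
from collections import defaultdict


def stats_by_category(rows, value_field, category_field):
    """Group rows by category_field and summarize value_field for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[category_field]].append(row[value_field])
    return {
        category: {
            "count": len(values),
            "sum": sum(values),
            "mean": sum(values) / len(values),
        }
        for category, values in groups.items()
    }


# Hypothetical joined rows; the values are made up for illustration.
joined_rows = [
    {"POTENCIA": 10.0, "wsNOM_CUEN": "Watershed A"},
    {"POTENCIA": 20.0, "wsNOM_CUEN": "Watershed B"},
    {"POTENCIA": 5.5, "wsNOM_CUEN": "Watershed A"},
]

for category, stats in stats_by_category(joined_rows, "POTENCIA", "wsNOM_CUEN").items():
    print(category, stats)
```

The real tool reports more statistics than this, but the sum per watershed is the number I was after.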

And that’s it. To review, we have: learned how to install QGIS; learned a bit about vector geospatial data and geometric operations on such data; created a spatial join; and analyzed the results of the spatial join using something like pivot tables.
