LiDAR processing - It's not black magic

By Robert A. Fowler

"My name is Bill Johnson. I recently read your article in EOM. I have come close to using LiDAR in the past, but can't get past a couple of "mind hurdles." I figure now is as good a time as any to ask questions.

First of all, I realize you can get points every ten feet, but what about curb lines, bridges and major breaks? I do not see how these can be accounted for without a definitive break-line. Without them, how can you accurately show contours in those areas?

Second, with the massive number of points that come back, how do you determine which ones are good and which ones need to be thrown out? Doesn't the laser bounce off anything in its path, such as a bird, a leaf or even a raindrop? Additionally, if you fly a heavy urban area, I would think the process of editing the points would drive you nuts. How much is involved in removing points from mailboxes, fire hydrants, poles, automobiles and so on?"

These are good questions, Bill, and ones that LiDAR operators get asked often.

You are right: LiDARs do not collect break-lines. Depending on the density of points, this can be a problem if break-lines are what you want. Most LiDAR systems today are set up to work over land using an infrared beam. The exceptions are hydrographic LiDARs, which use a different part of the spectrum. Infrared beams tend to be absorbed by water and give very weak returns, or none at all, when they hit it. So over water bodies there are what are called data voids. As these areas are usually pretty obvious to the processing technician, a "fence" can be put around the lake or river so that subsequent contouring does not run through it. That, however, is about the extent of easy break-line interpretation with LiDAR data.

However, LiDAR is a different technology, so let us step back and look at this from a different perspective. I always ask people what they do with contours. They are, after all, only a visual aid; nobody (or, to cover myself, I should say very few people) actually uses them for anything else. Most engineers convert them to a Digital Terrain Model (DTM), which is what a LiDAR gives you directly. The density of points which LiDAR provides gives most engineers far more information than they normally require.

That being said, because a LiDAR is a machine which captures data at a fixed rate (in our case 5,000 times per second), where the data points fall is somewhat random. By that I mean the system will not necessarily collect points exactly where you want them. So, for example, we cannot guarantee that a LiDAR collecting data along a road corridor will hit the bottom of a ditch next to the road. It may, or it may hit the side of the ditch, or if it's a very narrow ditch it may miss it altogether. As the system scans from side to side, it may hit the bottom of the ditch on the way back, but again not necessarily.
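A back-of-envelope calculation gives a feel for how far apart the points end up. The sketch below assumes the pulses are spread evenly over the swath, which is a simplification. The 5,000 pulses per second comes from the text; the ground speed and swath width are illustrative assumptions, not figures for any particular system.

```python
import math

def mean_point_spacing(pulse_rate_hz, ground_speed_m_s, swath_width_m):
    """Average distance between points, assuming even coverage of the swath."""
    area_per_second = ground_speed_m_s * swath_width_m  # square metres flown per second
    density = pulse_rate_hz / area_per_second           # points per square metre
    return 1.0 / math.sqrt(density)                     # metres between neighbours

# 5,000 Hz is the rate quoted in the article. The 60 m/s ground speed and
# 500 m swath width are assumed values, chosen only for illustration.
spacing = mean_point_spacing(5000, 60.0, 500.0)
print(f"average spacing: {spacing:.2f} m")
```

With these assumed numbers the spacing works out to roughly 2.4 m, or about eight feet, so a ditch a foot or two wide can easily fall between points.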

So if you are a client, how will you know if there's a ditch? The short answer is that, without some supplementary data such as imagery, you won't. So there is an argument for acquiring standard air photos or digital imagery at the same time as the LiDAR data, or for using LiDAR intensity data, although the latter is not necessarily a good answer to this problem. (The reason is that LiDAR intensity data record the strength of the return signal. When the signal hits materials of different reflectivity, the intensity records that difference. So if there is water at the bottom of the ditch, it will show with a different signal strength than the surrounding grass. However, if the ditch is dry and uniformly covered in grass, the intensity will be uniform too and the ditch will not be obvious.)

But the newest versions of LiDAR equipment collect data at higher rates (the next announced system is expected to collect 50,000 points per second). Depending on flying height and speed, at rates in the tens of kilohertz you will get data points very close together, and the chances of missing a ditch are considerably reduced. The downside to these systems (isn’t there always one!) is that the data files are so huge it is literally data overkill, and few clients have the computer systems to handle files of this size. This means the LiDAR operator has to decimate the files. Right now there is a dearth of intelligent software that knows where important features (such as narrow ditches) need to be kept and where the terrain is more regular and can be decimated with no real loss of information. Software to do this intelligently is coming, though.
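To make the idea of "intelligent" decimation concrete, here is one very simple way it could work. This is a hypothetical sketch, not a description of any existing package: it sorts points into square cells, thins the cells where the heights barely vary, and keeps everything in cells that show some relief. The function name, cell size and threshold are all assumptions.

```python
from collections import defaultdict

def decimate(points, cell_size=10.0, relief_threshold=0.5):
    """Thin (x, y, z) points: flat cells keep one point, rough cells keep all."""
    cells = defaultdict(list)
    for x, y, z in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y, z))

    kept = []
    for cell_points in cells.values():
        heights = [p[2] for p in cell_points]
        if max(heights) - min(heights) < relief_threshold:
            # Regular terrain: one point of middling height represents the cell.
            ordered = sorted(cell_points, key=lambda p: p[2])
            kept.append(ordered[len(ordered) // 2])
        else:
            # Some relief here (a ditch, a bank, a curb): keep every point.
            kept.extend(cell_points)
    return kept
```

A fixed grid and a single threshold are crude. A narrow ditch that falls in a cell with only one or two points would still be missed, which is part of why doing this well is hard.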

If that all sounds somewhat negative, it is not as bad as it sounds. Even at 5,000 points a second there is a wealth of data collected, and with digital imagery taken at the same time as the LiDAR survey you can have the best of both worlds. Many clients are developers or engineers who have to move dirt, and they need to know quantities. On average, LiDAR does a better job of providing the base data than other technologies, particularly in vegetated areas. It is also faster, so if a project has a short time frame, LiDAR is the quickest way of getting data to a client.

Finally, if you want to visualize LiDAR data without having contours, there are a number of software packages which allow you to make a shaded relief image from point data, and this can be as effective as contours, if not more so.
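For readers curious what a shaded relief calculation involves, the sketch below shades a small grid of heights by comparing each cell's slope with the direction of an imaginary sun. It assumes the points have already been interpolated to a regular grid of at least two rows and two columns. The function name and the default sun position (from the northwest, 45 degrees above the horizon) are assumptions for illustration, not the settings of any particular package.

```python
import math

def hillshade(grid, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Shade a height grid (rows run south to north, columns west to east).

    Returns one value per cell, from 0.0 (dark) to 1.0 (fully lit).
    """
    az, alt = math.radians(azimuth_deg), math.radians(altitude_deg)
    # Unit vector pointing towards the light (x east, y north, z up).
    light = (math.sin(az) * math.cos(alt), math.cos(az) * math.cos(alt), math.sin(alt))
    rows, cols = len(grid), len(grid[0])
    shaded = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Finite differences, clamped at the grid edges.
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dzdx = (grid[r][c1] - grid[r][c0]) / ((c1 - c0) * cell_size)
            dzdy = (grid[r1][c] - grid[r0][c]) / ((r1 - r0) * cell_size)
            # The surface normal is (-dz/dx, -dz/dy, 1), scaled to unit length.
            length = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            dot = (-dzdx * light[0] - dzdy * light[1] + light[2]) / length
            out_row.append(max(0.0, dot))
        shaded.append(out_row)
    return shaded
```

Slopes that face the light come out bright and slopes that face away come out dark, which is what lets the eye pick out ditches, banks and building edges without any contours.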

The second part of your question is very simple. I don’t actually do much LiDAR analysis, so it doesn’t drive me nuts at all! Of course, our technicians who work with the data might not feel the same way. OK, seriously: yes, the LiDAR will receive information from whatever it hits. If it does hit a bird then that is what you get. It is, however, unlikely to hit raindrops, as LiDARs don’t work well in rain, snow, thick smoke, very thick haze or clouds, so they don’t get flown in these conditions. But you are right, the beam can be bounced back from mailboxes, hydrants or a moving vehicle. We have even had the odd occasion when we’ve hit the top of a flag pole.

There are a couple of things which help in this regard. First is the beam width. This varies by type of laser and system. Although lasers generate a very narrow coherent beam, the beam does diverge over distance. So the beam that is a micron or so wide at source can be six inches to three feet wide by the time it gets close to the ground. As you can imagine, even at six inches wide, it is possible for the beam to hit a variety of things in succession. It could hit a small branch, several leaves and part of the corner of a building, and then continue down to the ground. Most systems allow the operator to select what he wants to record. It could be the first reflection, the last reflection, both, or, on some equipment, up to five different reflections from a single pulse. Depending on the client, and the use of the data, the choice will vary.
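The beam widths quoted above follow from simple geometry: the footprint on the ground is roughly the flying height multiplied by the angle over which the beam spreads. The sketch below ignores the width of the beam at the source and assumes flat ground directly beneath the aircraft. The flying height and the two divergence values are assumptions, chosen only because they land near the six inches and three feet mentioned above.

```python
import math

def footprint_diameter(flying_height_m, divergence_mrad):
    """Approximate beam diameter on flat ground directly below the aircraft."""
    half_angle = (divergence_mrad / 1000.0) / 2.0  # full-angle milliradians to half-angle radians
    return 2.0 * flying_height_m * math.tan(half_angle)

# Assumed values: 900 m above ground, narrow (0.2 mrad) and wide (1.0 mrad) beams.
for divergence in (0.2, 1.0):
    diameter = footprint_diameter(900.0, divergence)
    print(f"{divergence} mrad -> {diameter:.2f} m")
```

At 900 m these two settings give footprints of about 0.18 m and 0.90 m, so halving the flying height halves the footprint.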

However, as we are often dealing with gigabytes of recorded information, most clients will decide the last return is quite enough, thank you. It is usually (though not always) the most useful one, as it generally indicates a solid object such as a building or the ground.
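Choosing between first and last returns amounts to picking one echo out of each pulse's list. The sketch below uses a made-up data layout (a plain list of ranges per pulse) purely to show the idea; real systems store returns in their own formats, and the function name is hypothetical.

```python
def select_returns(pulses, mode="last"):
    """Pick one range per pulse from a list of per-pulse return lists.

    Each pulse is a list of ranges (metres from the sensor) in the order
    the echoes arrived, so the last entry is the farthest surface hit.
    Pulses with no recorded return are skipped.
    """
    if mode not in ("first", "last"):
        raise ValueError("mode must be 'first' or 'last'")
    index = 0 if mode == "first" else -1
    return [returns[index] for returns in pulses if returns]

# Invented example: one pulse through a tree, one on open ground, one lost.
pulses = [[880.2, 884.9, 899.7], [900.1], []]
print(select_returns(pulses, "first"))
print(select_returns(pulses, "last"))
```

For the pulse through the tree, the first return is the canopy at 880.2 m and the last is the farthest surface at 899.7 m. On open ground the two are the same echo.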

But, you are probably thinking, doesn’t the width of the beam have some effect on accuracy? Well, yes, although this is arguable. Wider beam systems do tend to average a little bit more, with some loss of accuracy. On the other hand, most of the energy returned tends to be from the center of the beam and there is a fall off towards the perimeter. But operators of the wider beam systems are usually flying very broad area projects where the highest accuracy is not so important.

However, the majority of the problem you mention is solved by software. Each LiDAR comes with software (or, if it is a proprietary system, software is written for it) which processes the complete data set to remove irrelevant data. How does it do this? While the workings of each package are carefully protected, the answer involves making numerous comparisons among all of the adjacent points.

To put this in terms easier to understand, think of what actually happens. Data are collected from everything the LiDAR hits: ground, buildings, trees, lamp standards etc. Now, imagine all of these data points in a 3D model, floating in space as little dots of light. And, in fact, there are a number of software packages which allow you to do exactly this. When you rotate the 3D model and view it from different angles, it soon becomes pretty obvious that a large number of the dots are on a "lowest plane" and the rest seem to float above it.

Because there are hundreds of thousands of data points, the processing software assumes the lowest level of points in the data set is the ground. It then assumes, logically, that all of the stuff floating above the ground must be something else. When the floating dots are in a nice regular shape, it is also pretty obvious they are a building, a bridge or a cultural feature of some kind. When they are irregular and sort of lumpy they are obviously trees or vegetation of some sort. I know this all sounds kind of hokey, but it actually works, and when you look at a LiDAR file with this sort of visualization software you can really see it happen.
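The "lowest points are the ground" assumption can be illustrated with a deliberately naive filter. This is not how any commercial package works (their methods are proprietary); it is a lowest-point-per-cell sketch in which the function name, cell size and tolerance are all assumptions.

```python
from collections import defaultdict

def classify_ground(points, cell_size=5.0, tolerance=0.3):
    """Split (x, y, z) points into (ground, above_ground) lists.

    A point counts as ground if it lies within `tolerance` of the lowest
    point found in its own grid cell.
    """
    def cell_of(p):
        return (int(p[0] // cell_size), int(p[1] // cell_size))

    lowest = defaultdict(lambda: float("inf"))
    for p in points:
        key = cell_of(p)
        lowest[key] = min(lowest[key], p[2])

    ground, above = [], []
    for p in points:
        if p[2] - lowest[cell_of(p)] <= tolerance:
            ground.append(p)
        else:
            above.append(p)
    return ground, above
```

The obvious weakness is a cell that sits entirely on a roof or inside dense canopy: its lowest point is not the ground at all. That is why real software compares each area with its neighbours rather than looking at one cell in isolation.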

But, still, you may say, how do we know that the lowest apparent point is the ground and not a lump of grass or an old cardboard box that someone threw out?

The answer to that is, we don't. No more than a photogrammetrist looking at a lump of grass will know whether it is grass or a real lump of dirt. But if the box is under a tree and the LiDAR misses the foliage and gets a return, it's still closer than the photogrammetrist's "guess" of the ground. (I spent a few of my years peering through binoculars into stereoplotters.)

So the LiDAR software has sophisticated algorithms which do a huge number of comparisons between points to determine more or less automatically which is ground, which is a building and which is most likely vegetation. Then there are the odd outliers, such as the single point on a flag pole, a light standard or a bird. These are assumed to be exactly that, outliers, and are usually discarded. When in doubt the software tends to remove single data points which appear above the ground, so the mailboxes, hydrants and flag poles are removed.
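Discarding lone points can be sketched as a neighbour count: a point above the ground with nothing else nearby at a similar height is treated as an outlier. The version below is a brute-force illustration meant for the above-ground points only, with an assumed search radius and neighbour count. It compares every point with every other, so it is far too slow for real data sets.

```python
import math

def drop_isolated(points, radius=2.0, min_neighbours=2):
    """Remove (x, y, z) points with fewer than `min_neighbours` others within `radius`."""
    kept = []
    for i, p in enumerate(points):
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius  # straight-line 3D distance
        )
        if neighbours >= min_neighbours:
            kept.append(p)
    return kept
```

A cluster of roof points survives because each one has several neighbours, while the single hit on top of a flag pole has none and is dropped.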

However, as a quality control mechanism, Lasermap, like many others, operates its LiDAR with a GPS-tagged video or digital frame camera, so the processing technician can scroll through the tape or photo record and view what was beneath the LiDAR at any one point. This can usually help sort out problems where interpretation is not obvious using the data file alone.

So the answers come down to so much data being collected, and semi-intelligent processing algorithms making assumptions that are right 95% of the time. With analysis and interpretation by the processing technician, this rate typically goes to 98%-99.9%. But if the LiDAR does happen to hit a large rock hidden for the most part by a tree, it will probably be accidentally removed. And that would be wrong. However, let’s be fair: no technology is without some limitations. Even interpretation in photogrammetry is subject to individual experience and intuition. And if that rock was really under the tree, the photogrammetrist wouldn’t see it either.

First Published in Earth Observation Magazine (EOM) – October 2001
