Mfld 9 image import slow and with "blurred" pixels
philw
96 post(s)
#15-Nov-18 02:50

I have noticed that images are very slow to import into M9 and are loading at up to 4 times the file size of the original image. I gather the latter may be due to decompression. I am importing .jp2 files from ESA Sentinel-2 satellite data. As an example, I just loaded a 99MB .jp2 file in 58s and a 76MB one in 1m42s. In M8 and QGIS these images load pretty much instantly (a second or two).

Also, a maybe unrelated issue: images in M9 are blurred when you zoom into them and look washed out compared to the M8 or QGIS view. M9 seems to interpolate between the pixels when you zoom in rather than show the original pixels as M8/QGIS do. Is there a way to stop this behavior?

Running an i7-8550 with 16GB RAM, a 256GB SSD and a 1TB HDD. Restarted to see if this would change the behavior. It didn't.

I also tried a large (1.5GB) SHP/DBF file to see if it was a read-speed issue, but M9 = 157s and M8 = 191s. QGIS was much faster (~5s to open the visual - the stream network for Australia) and took another 45s to open the attribute table.

danb


1,668 post(s)
#15-Nov-18 04:06

For compressed images such as jp2, try linking rather than importing; this should be instant. Regarding the interpolation between pixels, I remember something from way back in the beta about bilinear or bicubic interpolation or some such, but I would also prefer to see the pixels when zoomed in close.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

adamw


8,259 post(s)
#15-Nov-18 07:25

It was probably this: 9 currently uses bilinear interpolation, which blurs pixels at high zooms. That is a reasonable default, but sometimes the blurring is not desired. We do not currently have an option to control the interpolation mode, and we should add one.

adamw


8,259 post(s)
#15-Nov-18 07:22

Like Dan says, you likely want to link JP2 files instead of importing them.

JP2 files are JPEG2000. JPEG2000 is a format specifically designed to achieve super-high compression ratios as well as to provide a means to extract parts of the image at a desired resolution without decompressing the whole image. If you link such a file in 9, we extract the parts of the image necessary for the display as you pan and zoom = exactly the thing the format was designed to support. If you import such a file in 9, we decompress the entire image, retrieving all pixels. Since the compression ratio is high, the resulting image stored without compression will likely be much larger than the original file, and the time to write such an image will likely be high.
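The "extract only the needed parts" idea can be sketched with a toy tiled-storage model. This is illustrative only: real JPEG2000 stores wavelet codestreams rather than simple tiles, and the 256-pixel tile size here is an arbitrary assumption.

```python
# Toy model of why linking a JPEG2000 file is fast: only the tiles that
# intersect the requested display window need to be decoded, never the
# whole image. (Illustrative sketch only; real JPEG2000 stores wavelet
# codestreams, and the 256-px tile size is an arbitrary assumption.)

TILE = 256  # assumed tile edge in pixels

def tiles_for_window(x0, y0, w, h):
    """Tile indices (tx, ty) intersecting a w x h window at (x0, y0)."""
    tx0, ty0 = x0 // TILE, y0 // TILE
    tx1, ty1 = (x0 + w - 1) // TILE, (y0 + h - 1) // TILE
    return [(tx, ty) for ty in range(ty0, ty1 + 1)
                     for tx in range(tx0, tx1 + 1)]

# A 10980 x 10980 Sentinel-2 band is 43 x 43 such tiles...
width = height = 10980
total_tiles = ((width + TILE - 1) // TILE) ** 2   # 1849 tiles

# ...but a 512 x 512 screen window touches only a handful of them.
needed = tiles_for_window(5000, 5000, 512, 512)
print(len(needed), "of", total_tiles, "tiles decoded")  # 9 of 1849
```

Panning or zooming just changes which window is requested, so the cost per redraw stays proportional to the screen, not to the file.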

Regarding blurriness, what is the scale? If the scale is higher than the native scale (you can go to native scale by invoking View - Zoom to Native), that is, one pixel of the image is expanded into more than one pixel on the physical screen, then we understand where the blurriness might come from and have some ideas on how to make it better. But if the scale is lower than the native scale, especially significantly lower, then it would be helpful if you provided an example screen + file, either here or in a report to tech support.

Dimitri

5,119 post(s)
#15-Nov-18 16:12

I also tried a large (1.5GB) SHP/DBF file to see if it was a read speed issue but M9 = 157s and M8 = 191s. QGIS was much faster (~5s to open the visual - stream network for Australia) and another 45s to open the attribute table.

Manifold's format is .map - that's super fast. As noted in the introductory topics, it takes a while to import but once in .map nothing is faster.

Q leaves data in shapefiles, which are not as slow as many people say. But still, they are not remotely as fast as .map.

You can see that in the videos at https://www.youtube.com/watch?v=R1G11KtJ2EU and at https://www.youtube.com/watch?v=h2kB_mEatew ... the latter compares Manifold speed to PostgreSQL, which is much faster than Q with shapefiles.

That Australian data set was chosen for the comparison because it was a big theme on reddit a while back. The creator of the series of posters based on that data set said in an interview (with the New York Times, I believe) that his biggest problem was the slow response and constant crashing of the software he used, Q.

If you compare 9 running that data set in .map, the right way, to Q running it in shp, with the usual zoom box, panning, zooming and such, you'll see 9 is dramatically faster, from instant open to far faster display and access to data.

The key idea is to use .map. Don't use slow formats except as interchange. Plus, there are the numerous brain-dead limitations of shapefiles to consider if you leave your data in shapefiles. The bottom line is that leaving the data in shapefiles is missing an opportunity to gain speed plus modern ability to work with data.

adamw


8,259 post(s)
#16-Nov-18 07:25

I also tried a large (1.5GB) SHP/DBF file to see if it was a read speed issue but M9 = 157s and M8 = 191s. QGIS was much faster (~5s to open the visual - stream network for Australia) and another 45s to open the attribute table.

I somehow missed this the first time.

To add to what Dimitri says, the 157 seconds you cite for M9 is for importing data. If you want M9 to do what QGIS does, you can - that's linking = telling the system to use the data in the files as is, performing necessary conversions and caching on the fly.

Dimitri

5,119 post(s)
#17-Nov-18 17:40

Regarding linking vs. importing:

I launched QGIS 3.2 and opened the WatercourseLines shapefile that is all lines, from the Australian Hydro data set that became so famous on reddit. It is a 300,000 KB .shp and a 2,171,000 KB .dbf, with about 1.3 million lines in it.

Q starts displaying something right away as it loads objects from the shapefile (shapefile used to be Q's "native" format), but it took 58 seconds to load the whole thing. So it seems...

(~5s to open the visual - stream network for Australia)

… was just the first five seconds for it to start loading the shapefile, with another 50 seconds or so to actually load the thing.

When linking the shapefile, that is, leaving the data in the shapefile, 9 gives you the option of caching locally to get around some of the worst wretchedness of shapefiles, or not caching. With the cache box checked, it took 43 seconds for Manifold to load the whole thing.

With cache on in 9, everything happens almost instantly: pans and zoom box to various areas typically take less than a second. With Q, it's the usual ten seconds or more for redisplay, with 9 being roughly ten times faster.

With cache off, 9 is still very fast. As I mentioned in my post above, despite all the backwardness of shapefiles, the format is not as slow as people think. Q does a good job with it, as expected given that the vector structure of Q for over ten years was based on shapefiles as a native format. 9 also does a good job with shapefiles, but the strategy of 9 is to try to cache and otherwise use the infrastructure of the Radian engine wherever possible to get more speed, even when the data is left in shapefile format.

I guess the bottom line is that if you do an apples-to-apples comparison, leaving the data in the shapefile (your only choice in Q, and using File - Link in 9), 9 is slightly faster to load the data (43 seconds vs. 58 seconds) and much faster (approximately 10x) in using the data thereafter.

If you do an apples to oranges comparison, leaving the data in shapefile in Q and importing it into Manifold .map format with 9, then you have a longer, one-time import with 9 but after that everything is instantaneous. One aspect of that instantaneous opening of .map files is the ability to nest them and to make instant saves. You also get away from the truly stone age limitations of shapefiles on data types.

Last, but not least, with .map you are not limited to the relatively small data sizes allowed in shapefiles.

I'd therefore recommend taking on the one-time import and saving the data in .map. You can always export it quickly enough to .shp should you need to do so for interchange.

philw
96 post(s)
#19-Nov-18 02:52

Thanks for your efforts investigating this, guys. I did know about SHP being the (old?) native format in QGIS and assumed that it was probably linking the data, so future processing/transforming of the data would likely take more time than in M9. I really only ran the test between M8/9 and QGIS to see if it would shed any light on why M9 was taking so much longer to load the .jp2 files (i.e. whether it happened with vector formats as well). Turns out using a SHP file probably was not the best idea; I should have imported something into QGIS that requires conversion to its native format, as Manifold has to do.

I have had a response to the "bug" report from Manifold and the story from them is consistent... not a bug with .jp2 import but likely a problem with the product formulation at the producer end (i.e. ESA). Waiting on FTP details to send them a file so they can check the exact issue.

Dimitri

5,119 post(s)
#19-Nov-18 03:16

why M9 was taking so much longer to load the .jp2 files

You can either import jp2 files or you can link them. If you want to do an apples to apples comparison, then link them using File - Link. That takes zero time in 9 and the contents of the file will be displayed instantly as well. If all you want to do is look at a jp2, then just link the thing. No need to import it, unless you want to take advantage of the other benefits of .map format.

jp2 is like ECW... it is a compressed format designed to display quickly, at the cost of (usually) being lossy. jp2 gives you good display speed by fetching only what you are looking at, which is just a fraction of the data.

Suppose you have a 20GB jp2 file. You can't show 20 gigabytes worth of pixels on a computer monitor. You can only show a MB or so, so what the jp2 does is just show you a top-level, interpolated view. It's only showing 1 MB, not the full 20GB. Great for looks and utterly useless for actual data analytics.

As you zoom further into the view, only those pixels needed to populate what you see on screen are generated, always only a tiny fraction of the full 20GB. So, sure, it's fast. For viewing.
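The fraction involved is easy to put numbers on. These are assumed figures (a 1920 x 1080 monitor showing 3-byte RGB pixels), not anything from Manifold's internals:

```python
# Rough numbers for the 20 GB example above (assumed figures: a
# 1920 x 1080 monitor showing 3-byte RGB pixels).
image_bytes = 20 * 1024**3        # the fully decompressed image
screen_bytes = 1920 * 1080 * 3    # ~6 MB actually visible at once

fraction = screen_bytes / image_bytes
print(f"{fraction:.4%} of the image is on screen")  # about 0.03%
```

However deep you zoom, a screenful stays a screenful, so the decoder never has to touch more than that tiny slice at a time.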

It may or may not make sense to import a jp2 into a project. That takes longer (of course) because Manifold is extracting every last byte from the file, decompressing the whole thing, all 20GB if the data set is that big, converting it to the non-lossy internal image storage format, and making all of the data available for analytics. Given the lossiness most jp2 files introduce by compressing the data, what you get from the jp2 may not be useful for analytics. But at least it will be in .map format and thus fast to use.

Take a look at the Import or Link ECW example topic. It's the same deal as jp2. You should probably also review the initial topics on getting started (talks about time to import, etc.) and the Importing and Linking topic.

tjhb

8,410 post(s)
#19-Nov-18 06:28

Your posts have been precisely on the nail here Dimitri (if I may say so). I was hoping someone would say something like this. You have said it better.

I would just add (something you have said but mostly left implicit): no one should use JPEG2000 or ECW images for analysis. Not ever. They are no longer data, only handy data visualizations.

Lossless JPEG2000 might be an exception (after import), I don't know. I have never tried a round trip. It might work, though I would never trust it (not as me, but especially not as a client).

Data means pixels.

tjhb

8,410 post(s)
#19-Nov-18 06:56

This might be worth adding.

ECW and JPEG2000 images don't contain pixels.

They contain the relation between one pixel and the next (and the next...), stored as a wavelet (a tiny equation).

That is, they contain a sampling of local data variation, more or less approximated, but no actual data.
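The point can be made concrete with a one-level Haar transform, the simplest wavelet: it stores averages plus differences (the relation between one pixel and the next), and lossy modes quantize the differences, after which the original pixels cannot be recovered. This is only a sketch; JPEG2000 uses a more elaborate multi-level wavelet, but the principle is the same.

```python
# One-level Haar transform on a toy row of pixels. Keeping the exact
# differences is lossless; quantizing them (here, dropping them to
# zero) is lossy and irreversibly smears neighbouring pixels together.

def haar_forward(pixels):
    avgs = [(a + b) / 2 for a, b in zip(pixels[::2], pixels[1::2])]
    diffs = [(a - b) / 2 for a, b in zip(pixels[::2], pixels[1::2])]
    return avgs, diffs

def haar_inverse(avgs, diffs):
    out = []
    for s, d in zip(avgs, diffs):
        out += [s + d, s - d]
    return out

row = [100, 104, 98, 90, 120, 121, 60, 64]

# Lossless: keep the exact differences -> perfect round trip.
a, d = haar_forward(row)
assert haar_inverse(a, d) == row

# Lossy: throw the differences away -> pixels become local averages.
lossy = haar_inverse(a, [0.0] * len(d))
print(lossy)  # neighbouring pixels are now averaged together
```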

philw
96 post(s)
#20-Nov-18 03:53

Bit silly of the ESA to spend over 6B euros on a program that produces useless data, really. Given that they supply their data as .jp2, and over the last few years we have done hundreds of analyses on them including NDVI, NDWI, turbidity, chlorophyll etc., it seems to work for our purposes. We have been using QGIS to convert the .jp2 files to GeoTIFF (.tif) files for analysis in M8. I just wanted to see if we could drop that step in M9.

The SNAP Sen2cor processor for atmospheric correction (L2A) also outputs .jp2 files. I am assuming that they are fit for purpose.

Would be interested in hearing from other M8/9 users about their workflow for Sentinel satellite data.

I have attached an example of a (low res) .jp2 band data from S2A in case anyone wanted to look at the format.

Attachments:
T51LWC_20170401T015651_B10.jp2

Dimitri

5,119 post(s)
#20-Nov-18 07:57

Bit silly of the ESA to spend over 6B euros on a program that produces useless data, really. Given that they supply their data as .jp2, and over the last few years we have done hundreds of analyses on them including NDVI, NDWI, turbidity, chlorophyll etc.

Looking at pretty, detailed, satellite pictures is not useless. Every well-run bureaucracy understands that, and they also know that for every person interested in and capable of mindful analytics there are thousands of people who just want to "eyeball" a scene as viewed from space. So, if they use lossy compression to make that easier for the vast majority of their "stakeholders," that's smart marketing to the people who vote them billions of euros.

As for analytics, the key question is: Do they publish the data in jp2 using lossy compression? If so, when you do analytics on that you are not working with the full resolution of the original data. You are working with smeared (that is, averaged) data. For many purposes, that's OK. You might not be working at full resolution, but if your results are pretty, then the people who are consuming those results will just say, "wow, this is beautiful." They probably will not say, "gosh, he did an analysis on jp2, so he's not really reporting data at the full ESA resolution but just something that's been averaged down by lossy compression to less than full resolution."
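One way to see the "smeared data" concern concretely: an index like NDVI computed from averaged pixels is not the NDVI of the original pixels. The reflectance values below are toy numbers, not real Sentinel-2 data.

```python
# NDVI = (NIR - Red) / (NIR + Red). Averaging neighbouring pixels
# before computing the index gives a different answer than computing
# it per pixel: lossy smearing changes the analytics, not just the
# looks. (Toy reflectance values, not real Sentinel-2 data.)

def ndvi(nir, red):
    return (nir - red) / (nir + red)

# Two adjacent pixels: bare soil next to dense vegetation.
nir = [0.30, 0.60]
red = [0.25, 0.10]

per_pixel = [ndvi(n, r) for n, r in zip(nir, red)]   # ~0.09 and ~0.71
smeared = ndvi(sum(nir) / 2, sum(red) / 2)           # 0.44 exactly

print(per_pixel, smeared)
```

The sharp soil/vegetation boundary in the per-pixel result collapses into one intermediate value once the pixels have been averaged together.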

The SNAP Sen2cor processor for atmospheric correction (L2A) also outputs .jp2 files.

Almost certainly the sensor does not output in jp2. If by "processor" you mean whatever ESA does to re-package the original data for their website, well, then they might be translating it into jp2.

philw
96 post(s)
#20-Nov-18 10:47

Fully understand what you are saying about lossy compression etc. (understood that before this conversation started). The ESA Sentinel data is much more than eye candy, as seems to be suggested. There is a huge community of users conducting serious research using all sorts of analysis of the multi-spectral data. Sure, some people just eyeball it, and that is probably great for the true colour imagery. The other 8 bands are (I would think) almost always used for some sort of "raster maths" type analysis. The very point of the Sentinel satellites is global change analysis (numerically calculated).

As far as I can see, the only format you can download the raw data in is .jp2. This seems to be the native format for distribution to all of the various tools that use the data, including the SNAP tool produced by ESA. Sen2cor is a software processor that performs bottom-of-atmosphere (BOA) corrections on the L1C TOA reflectance data, which is what is typically offered in the distribution databases. Although, the L2A BOA data is becoming much more available as pre-processed data now.

The SNAP tool does output the BOA data as .hdr (ENVI IMG version) on request, but I guess it came from the .jp2 data to start with, so there is probably no advantage in that for import to M8/9 with respect to data quality. I will start using that format though, as it seems to import with the correct projection/datum info, which the .jp2 does not.

Dimitri

5,119 post(s)
#20-Nov-18 14:51

The SNAP tool does output the BOA data as .hdr (ENVI IMG version) on request but I guess it came from the .jp2 data to start with so probably no advantage in that for import to M8/9 with respect to data quality

The key question is whether they use lossy compression for the data they provide in jp2 format. Do they do that?

If they use lossless compression, it really doesn't matter what format you use, as all lossless formats should provide the same data.

As far as I can see the only format you can download the raw data in is .jp2.

A quick reading of the ESA Sentinel pages indicates they do not provide the raw data to users. It appears they only provide processed data. Their raw data, I would expect, almost certainly does not use jp2. For example, from the Sentinel-2 MSI technical guide:

Level-0 is compressed raw data. The Level-0 product contains all the information required to generate the Level-1 (and upper) product levels.

[...]

Note: Level-0, Level-1A and Level-1B products are not disseminated to users.

I don't have anything for or against various image formats. I'm just noting that JPEG2000 can use either lossy or lossless compression. If lossy compression is used, that's not good for analytics but often fine for looking at images.

adamw


8,259 post(s)
#20-Nov-18 08:58

We have been using QGIS to convert the .jp2 files to Geotiff (.tif) files for analysis in M8. Just wanted to see if we could drop that step in M9.

If you want to analyze pixels in JPEG2000 files (this might make sense if the compression was lossless, or if the loss of data due to compression is accounted for in the analysis), then yes, with 9 you don't have to go through TIFF, you can just import JPEG2000 directly. The import will take time proportional to the size of the uncompressed data, but that's the same as converting to TIFF and then importing that. However, instead of uncompress JPEG2000 - write TIFF - read TIFF - write MAP, you just do uncompress JPEG2000 - write MAP; the latter is obviously faster and uses less disk space.

tjhb

8,410 post(s)
online
#20-Nov-18 22:14

Global Mapper metadata for Phil's example JPEG2000 file T51LWC_20170401T015651_B10.jp2 says that its target compression ratio is 1:5.

If that is correct, then it is not lossless but (mildly) lossy. Some local variation between source pixels will have been smoothed away (and can't be recovered on decompression).

For analysis, it would be better (and different, and more natural) to use the ENVI IMG alternative source instead. The lossy JPEG2000 source is meant for viewing.
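For scale, here is the arithmetic a 1:5 target ratio implies. The figures are assumptions for illustration (a 10980 x 10980 band at 2 bytes per pixel; the actual B10 band is coarser resolution, so this is just the shape of the calculation):

```python
# What a 1:5 target compression ratio implies (assumed figures: a
# 10980 x 10980 band at 2 bytes per pixel; the actual B10 band is
# coarser resolution, so this is just the shape of the calculation).
raw_bytes = 10980 * 10980 * 2      # ~230 MB uncompressed
jp2_bytes = raw_bytes / 5          # ~46 MB at the 1:5 target

print(round(raw_bytes / 1024**2), round(jp2_bytes / 1024**2))  # 230 46
```

Lossless JPEG2000 typically manages only around 2:1 on imagery, so a 5:1 target is itself a hint that lossy coding is in play.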
