Linking Large ADF in Manifold 9
mikedufty

804 post(s)
#04-Apr-18 05:13

I have downloaded a very large (67 GB) ADF dataset of Australian elevations that I would like to link into Manifold 9.

I have tried linking the hdr file, and after an hour the import appeared to have completed, but I can't do anything with it; I just get "invalid heap" messages.

Any ideas what might be going wrong?

Is import the best approach for this? I only need read-only access.

Would it be better to add it as a data source? Or is that just the same thing via a different menu?

My main aim is to be able to browse and export manageable chunks of this dataset to use in Manifold 8 more efficiently than downloading bits from Geoscience Australia.

Dimitri


4,980 post(s)
#04-Apr-18 06:36

Any ideas what might be going wrong?

Not without *way* more information on details. You know the drill: What version of 9 are you using? Tell us about your computer system... free disk space, where the TEMP folder is located, size of pagefile, etc, etc. Is the download on the same machine? On the same hard disk? Is the data accessed through a network on an archival disk shared by several local machines? etc., etc.

Do you have a link to the data? Have you contacted tech support? What did they say, if you did contact them?

9 is generally very good at working with big data on even small machines, but a) it can have bugs, which can only be tracked down with plenty of detail, and b) it cannot work miracles, such as enabling use of very large data on machines that do not have enough free space to work with big data. So, to get to the interesting causes of problems we should begin by collecting all the details necessary to exclude the obvious issues.

Would it be better to add it as a data source? Or is that just the same thing via a different menu?

In general, yes. But it is not always the same thing.

Best of all would be to import into Manifold and then save the result as a .map. Link that .map into any future projects that will use the data. Why use .map? Because it is a faster format than .adf.

adamw

8,061 post(s)
#04-Apr-18 07:53

The number one thing to sort out is the 'invalid heap' messages, and we need more data to figure out where they come from (e.g., is the data set vector or raster? Is it linking or importing? You say 'linking' in the title, but then mention that the 'import' appeared to have completed). Please contact tech support.

Some time to pre-process a large data set is to be expected. If the data set is raster, we have to build intermediate levels, and if it is vector, we have to build a spatial index. In both cases we need to read the entire data set, and that is going to take time. Moreover, if we are talking about linking, it is important to then let the data source save the composed data by answering 'Yes' to the 'Save data?' prompt when you close or save the MAP file - otherwise the data source will have to recompute it in future sessions. But, again, the most important thing now is to figure out where the 'invalid heap' messages come from - it could be wrong data in the file, it could be our issue, etc.
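
Purely as an illustration of why the whole data set has to be read (this is not Manifold's internal code; the idea is simply analogous to what GDAL calls building overviews, and the path and settings below are placeholders):

```python
# Illustration only: building raster overviews (pyramids) with GDAL is
# analogous to the intermediate levels Manifold computes on import/link.
# Each level is a downsampled copy, so the whole grid must be read once -
# hence the long pre-processing time.
from osgeo import gdal

# Hypothetical path to the ADF grid's hdr.adf
ds = gdal.Open(r"\\server\elevation\dem-h\hdr.adf", gdal.GA_ReadOnly)

# For a read-only source GDAL writes the overviews to an external .ovr file.
ds.BuildOverviews("AVERAGE", [2, 4, 8, 16, 32])

ds = None  # close and flush
```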

mikedufty

804 post(s)
#04-Apr-18 09:53

It is a raster dataset - surface elevation derived from SRTM 1-second data for the whole of Australia.

DEM-H dataset from here:

https://ecat.ga.gov.au/geonetwork/srv/eng/search#!aac46307-fce8-449d-e044-00144fdd4fa6

It is saved on a network drive.

Running Manifold 9.0.166.0

97GB free on SSD

I used File - Import.

It then displayed "copying data", I think, for about an hour.

Really just experimenting with 9 at the moment, but this seemed like something it would be good for.

Might try an import overnight.

The dataset works fine with QGIS, just a bit slow, though not an hour slow.

Dimitri


4,980 post(s)
#04-Apr-18 10:40

The dataset works fine with QGIS, just a bit slow, though not an hour slow.

Manifold .map isn't slow. It tends to pop open instantly. From the third paragraph in the Importing and Linking topic:

Importing large files can take a long time because the imported data will be analyzed and stored in special, pre-computed data structures within the Manifold file that allow subsequent reads and writes to be very fast. It pays to be patient with such imports as once the data is imported and stored within a Manifold project file access to that data will usually be far faster than it was in the original format. Once imported the data will open instantly thereafter.

So yes, the initial import can be slow. After that it is very fast, often instantaneous.

Be that as it may, our task is to find out why you are getting an error. One obvious thing to do is to simplify the situation to exclude causes of error that have nothing to do with Manifold. For example:

It is saved on a network drive.

OK. Networks often have errors. Do all your work locally to eliminate the possibility of network errors causing symptoms. If everything works perfectly locally but doesn't when you start using your network, it can make sense to take a look at your network.

97GB free on SSD

I don't recall if ADF is a compressed format. If it were a 67 GB TIF you wouldn't have enough free space to save a big project on that drive. See "Must Have Free Space on Disk" in the Performance Tips topic.
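
If in doubt about the on-disk size, a quick check is easy to script, since an Arc/Info binary grid is a folder of tile files; the UNC path below is just a placeholder:

```python
# Sum the size of every file in the ADF grid folder to get its real
# on-disk footprint. The folder path is a placeholder - point it at the
# directory holding the hdr.adf / w001001.adf tiles.
import os

def folder_size_gb(path):
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / 1024**3

adf_folder = r"\\server\elevation\dem-h"   # placeholder path
print(f"{folder_size_gb(adf_folder):.1f} GB on disk")
```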

It also helps to have the info I mentioned earlier. You don't say anything about your pagefile, TEMP folder, etc.

About the data: your link opens to a page with many links on it. The DEM-H link is only 26.8 GB. Is that it?

adamw

8,061 post(s)
#04-Apr-18 16:59

If 97 GB is all the space available for an import of a 67 GB ADF (that is, all the space available for converting ADF to MAP, which involves creating additional data for intermediate levels as well as for MAP file structures), then that is very likely not enough space and we are looking at an 'out of disk space' error.

This (not enough space) might have an effect on the performance of the import as well.

The rule of thumb is to have temp space of 3 times the size of the imported data, before accounting for compression. That tries to cover data which might have to be added, plus one temporary copy for the save.
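
Spelling that rule of thumb out with the numbers from this thread (67 GB of source data, 97 GB free), just as a back-of-the-envelope check:

```python
# Back-of-the-envelope check of the "3x the imported data" rule of thumb
# using the figures mentioned in this thread.
source_gb = 67              # uncompressed ADF data set
free_gb = 97                # free space reported on the SSD
needed_gb = 3 * source_gb   # rule-of-thumb temp space

print(f"temp space suggested: {needed_gb} GB")            # 201 GB
print(f"shortfall vs free:    {needed_gb - free_gb} GB")  # 104 GB short
```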

mikedufty

804 post(s)
#06-Apr-18 02:59

Probably need to stick with QGIS then for this dataset.

Dimitri


4,980 post(s)
#06-Apr-18 05:16

Why? So you can run slow with it every time you use it? Is that not a case of "penny wise and pound foolish"?

Why not figure out how to import into Manifold, if possible, save as a .map, and then run fast every time thereafter?

I write "figure out" because sometimes the different workflow used by different tools needs to be learned including how to deal with their particular constraints or requirements. That's often very positive because it can open more efficient methods, and sometimes, what appears to be a hassle initially can save much time in the long run if it points out a weakness in other workflow or infrastructure that could be holding you back in other areas.

An example of that might be (just saying "might" as you need to be on the scene to know) a limitation of 97 GB of free space. That is an absurdly small amount of free space if you are working with data sets where a single one of them is 67 GB in size. Thinking "well, I can get around it this once using this slow tool to squeeze by..." may certainly be OK on a one-time basis, but only until the next "one-time" rolls around. :-)

Many of us have had that feeling, of knowing we are working with too little free space for the size data we routinely manipulate, but feeling a need to just get by with the task right now. When you start having to think about "do I have enough room for this next file?" on disk, well, that's a sign the low amount of free space has to be dealt with. Move out what you don't need or get more space.

In my own work I often make copies willy-nilly so I have backups, and backups to backups. The result? The other day I realized on my primary workstation I had about 50 GB free space left despite having plenty of terabytes in local storage plus connections to effectively limitless archival storage. Doh. I had been wasting time moving files about to open up enough free space for new, big data I was acquiring.

So I invested some time doing spring cleaning, organizing more rational archives, consolidating copies of copies that had proliferated and what do you know... 2 TB free space!

Cannot resist adding... even at that, 2 TB is nothing. Heck, a 2 TB SSD is a mere $410 these days. Visiting newegg.com just now and taking a look, there are plenty of 8 TB drives at prices around $230, and you can buy a 12 TB drive for $450. Having vast storage space is one of the cheapest, yet most effective investments you can make.

mikedufty

804 post(s)
#06-Apr-18 08:08

Why? Because it is not so easy to retrofit a large hard drive in a laptop, and I might only use that dataset once a year or so, so in this case it is not worth fiddling with. I'm sure there are other applications Manifold 9 will be good for.

Dimitri


4,980 post(s)
#06-Apr-18 10:08

it is not so easy to retrofit a large hard drive in a laptop,

Ah, I admire your ambition. :-) But, you know, the above is why tech support asks about systems before offering advice, debugging, etc. It helps to know if somebody is running 32-bit or 64-bit, working on a laptop, etc.

But... since you've raised the question it seems only right to follow up a bit on this for the benefit of other forum users.

Are you running 32-bit or 64-bit on your laptop?

How much RAM memory do you have in your laptop?

What version of Windows are you running?

---

In the "for what it is worth" department, consider upgrading the drive in your laptop. I've done that many times in various laptops and, so far, it's always worked perfectly and has been much easier than I thought it would be.

Consider that even if you do this only once a year or so, are you going to leave a big data set on a laptop if space is tight? I tried downloading the link, the one I asked about:

The DEM-H link is only 26.8 GB. Is that it?

... and the site is very slow, reporting about 9 to 12 hours for the download. The download was interrupted when I tried two days ago, so I've just launched it again. The site choked on both Chrome and Opera, so I had to use Edge, which is not as good as either Chrome or Opera for restarting an interrupted download. If something takes nine hours to download, why ever repeat that?

mikedufty

804 post(s)
#09-Apr-18 07:46

64-bit Windows 10 Pro, 12 GB of RAM, only Intel graphics.

The dataset is saved on our file server, so there is no need to download it again. The download link is correct; it is only 26.8 GB because it is zipped.

Putting a big drive in the laptop is possible, but since there is space for only one drive I would lose the benefits of an SSD.

Keeping data on the laptop would prevent others in the office from getting to it. We would also have to implement backups for the laptop.

We only ever need small, catchment-sized chunks of the data at any one time, so QGIS works well to export a suitable chunk to work with. Having it in Manifold 9 format would probably work even better, but not enough to bother for this particular dataset, particularly as we only have one 9 licence to try out for the moment, so the other 16 Manifold users would not be able to take advantage. I am not following up with support because I don't actually need to get at that data now; I just thought it might be a useful one for Manifold 9 to show how good it is.
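
For what it is worth, that kind of catchment-sized extract can also be scripted with GDAL's Python bindings rather than done interactively in QGIS; the paths and the bounding box below are placeholders only:

```python
# Sketch: clip a catchment-sized chunk out of the national DEM-H grid.
# Paths and the bounding box are placeholders; projWin is given as
# [ulx, uly, lrx, lry] in the raster's coordinate system (degrees here).
from osgeo import gdal

gdal.Translate(
    "catchment_dem.tif",                    # output GeoTIFF chunk
    r"\\server\elevation\dem-h\hdr.adf",    # hypothetical path to the ADF grid
    projWin=[115.5, -31.5, 116.5, -32.5],   # example window only
    creationOptions=["COMPRESS=DEFLATE", "TILED=YES"],
)
```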

mdsumner


4,205 post(s)
#09-Apr-18 11:46

(I also appreciate it btw, it's a magnificent data set to have on tap - thanks!)


https://github.com/mdsumner
