This program will never make it - but why I use it anyway! (long post, sorry)
artlembo


3,156 post(s)
#01-Apr-21 16:33

The This program will never make it thread was getting a little too focused on a few specific issues, and while I think there is really good discussion in it, I wanted to open this new thread to address some of the things I've been thinking about as they relate to 9, and where it fits into people's solutions. Hopefully, others can discuss how they are using 9, and how it fits within their solution set.

First things first, what I call the donut hole

I like 9. It's kinda cool to work with, but like many others have said, it is still a work in progress with some limitations. Also, the interface isn't as polished as I might like.

Smaller data

The reality is, if I have a dataset with 2 million vector features or fewer (basically any county GIS data), Manifold 8 is still awesome - albeit dated when it comes to reading in certain formats. QGIS is also very easy to work with if you have 2 million features. ArcGIS Pro has also repackaged things very nicely. All 3 of these products cut through the data really, really nicely. So, personally, I won't likely use 9 to work with datasets under 2M features, as there are more mature products that get the job done. This is where I think most people live.

Insanely large data

Someone once told me, if it fits on your computer, then it's not big data. I tend to agree, but as I deal with GIS professionals as an educator, I like to tell them if it's more data than you've ever worked with, then it's big data to you. But, for argument's sake, let's go with the first scenario, massive amounts of data.

In reality, QGIS, ArcGIS Pro, and Postgres aren't going to cut it, and neither are any of the versions of Manifold. To deal with this, you likely need a server, with lots of clients. And, in many cases, you need to spin up something like Hadoop to make it work. I've been there, done that, and never want to go through that experience again. I'm no rocket scientist, but I'm no dummy, either. Even so, standing up a Hadoop cluster took me many days. But, it got the job done. In reality, with my 40-seat lab, I could probably just break the data up into 40 chunks, put one on each computer, and then run Postgres on it over a weekend (getting my exercise by running to each computer to hit the run button!).

I don't see many people living in this ecosystem, and those that do, likely get paid a lot of money to spin up servers.

Very large data - the donut hole

Many people fall into this donut hole - they don't have the insane amount of data that requires Hadoop, but the data is way larger than a few million objects. This is where 9 really shines. I don't know the breaking point yet, but let me give you an idea of how I just used 9:

I'm doing some work with the International Union for Conservation of Nature Red List of Threatened Species, and we are looking at worldwide monkey population and their relationship to mangroves.

The mangrove data is 330 billion pixels in size (yes, billion). It is a binary data source, so it is rather ridiculous to store the data that way in a raster, but so it is. It turns out there are 61 million mangrove points that we can extract from the raster.

The monkey polygons are not a particularly large dataset, but the number of vertices is insane - oh, my goodness, it's huge! Really intricate, sinuous polygons.

We needed to determine the number of mangroves in each of the monkey polygons, and we had to buffer each monkey polygon by different amounts.

After working with the data for months, the researchers came to me, because they knew I sort of swim in this ecosystem. 9 was almost perfect for this. Almost, and hopefully, what comes next could be some good additions to the program.

I'm going to be writing a formal paper on the methodology, but the following is what I did:

1. Convert the 330 billion pixel raster to 61 million polygons (9 struggled with this, but ArcGIS Pro was able to do it).

2. Convert the polygons to point centroids in 9. I didn't even need to write code for this, just used the GUI.

3. Used the Join dialog to count the points in the polygons (more on this below).

4. 17 minutes later, the process was done! My colleagues have been running their same command for 3 days.

Getting the overlay to work

Just like the other GIS products, 9 couldn't get through the process even after 24 hours. So, I used a little trick in PostGIS called ST_Subdivide, and turned the 928 monkey polygons into 472,000 polygons with no more than 20 vertices each. You can see my logic for subdividing the data here. After completing the join, I knew how many points were in each subdivided polygon, and simply did a sum, grouped by the monkey polygon id.
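To make the idea concrete, here is a minimal PostGIS sketch of the subdivide-then-regroup approach (not my actual queries - the table and column names monkey_polys(id, geom) and mangrove_pts(geom) are made up for illustration):

--PostGIS
-- split each monkey polygon into pieces with no more than 20 vertices
CREATE TABLE monkey_parts AS
SELECT id, ST_Subdivide(geom, 20) AS geom
FROM monkey_polys;

CREATE INDEX ON monkey_parts USING gist (geom);

-- count points per subdivided piece, then roll the counts up per original polygon id
SELECT parts.id, count(pt.geom) AS mangrove_count
FROM monkey_parts AS parts
LEFT JOIN mangrove_pts AS pt
  ON ST_Contains(parts.geom, pt.geom)
GROUP BY parts.id;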

That was it. I can't tell you how much data is actually there, but after everything was done, I had used up over 100GB on my computer (some data is duplicated in Postgres, some in 9, and some in a geodatabase). But yes, that's a good amount of data.

So, 9 is a perfect tool for that donut hole, converting many days of processing into under 20 minutes. And it even turned many months of batting around ideas into about 3 days (I experimented with a few options, so that's what it took me to get my head around this).

In this case, 9 was the only way to blast through this data without going DEFCON V with Hadoop. So, for $95, you have a screaming software product to deal with lots of data and intricacies.

Things I'd like to see Manifold add:

1. a really good raster-to-vector (point or polygon) tool to convert pixel values to points (the current tool could not get through all the data in any reasonable amount of time).

2. an ST_SubDivide command. This is really critical if you are going to work with big data. Sometimes you have to partition the data to give the software a chance. I'd add to that the ability to create tessellations like hexagons, because they are often used to partition a large empty space (see the sketch below for the kind of thing I mean).
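To show the kind of thing I mean, PostGIS 3.1+ already has a hexagonal grid generator; a rough sketch (the grid size, envelope and SRID are just illustrative values):

--PostGIS
-- generate a hexagonal tessellation covering an illustrative projected extent;
-- 10000 is the hexagon size in map units
SELECT hex.geom, hex.i, hex.j
FROM ST_HexagonGrid(
       10000,
       ST_MakeEnvelope(0, 0, 1000000, 1000000, 3857)
     ) AS hex;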

One comment about the Help Manual:

The biggest Con with the help manual is searching for what you are looking for. It's terrible in my opinion. I think a .chm provides so much greater flexibility. I would hope they could produce something like that. I truly hate the search function on the browser and the index tab.

The biggest Pro with the help manual is the detailed information. I never appreciated it more than when I worked on this project. The examples are really great and detailed. I had never used the Join function before, and taking only 10 minutes to read that section and the example of counting cities inside of states was all I needed. I'm a reasonably intelligent person, and the help manual treats you as such - it is written with a lot of detail, so that if you've never used a function before, it will step you through it rather nicely. But they do expect you to read it. And no, I didn't read hundreds of pages. But I did read the pages I needed to, and took that aspect seriously. (BTW, I'm the guy who, when fixing a lawnmower, doesn't read the directions - I jump right in. That won't work with 9; you really do have to read it, and all the info you need is right there.)

My apologies for such a long post, but I hadn't really seen much about how people are actually using 9 in the wild, and where it fits in. Hopefully others can elaborate on their own use cases.

adamw


9,552 post(s)
#01-Apr-21 17:12

Thanks! Very interesting!

For 1, extracting vector points from raster, are we talking about Trace? Were points single-pixel or small clusters of pixels? Trace extracts areas, so if you needed points, there was a lot of unneeded computation that could have been eliminated. We could probably add an option to convert small areas into points; this would have helped the performance, likely significantly.

For 2, ST_SubDivide, we hear you. We are also planning to make it less necessary to run things like ST_SubDivide on large geoms - although if you are willing to deal with merging the results back, it'd still make sense to run it.

In general, our focus is on making 9 a much better fit for your "Smaller data" scenarios. That's what we are concentrating on currently.

Thanks again, a great write-up.

artlembo


3,156 post(s)
#01-Apr-21 20:52

extracting vector points from raster: as I mentioned, the raster is binary (1/0). So, any pixel with a 1 we'll turn into a point, and ignore the 0s. Therefore, I wasn't interested in areas, but rather individual pixel centroids. I was attempting to use TileGeomToValues, but the 330 billion pixels was just too much to deal with. I even wrote a script to cycle through the st_subdivide geometries to break it off into smaller chunks, but to no avail.

tjhb

9,627 post(s)
#01-Apr-21 22:14

Art, I'm curious, did you try a workflow that did not convert the raster(s) to centroids?

I don't know the details, but perhaps you could just decompose the monkey polygons into triangles, or into convex parts (a familiar step to both of us), then Join those smaller areas to the original rasters. (Then sum.)

Raster storage is massively efficient, especially in Manifold 9, and the Join dialog is flexible enough to allow raster-vector analysis without translation. Power.

Tim

artlembo


3,156 post(s)
#01-Apr-21 22:46

Good advice. And yes, I did attempt to do that. The storage, as you say, is very efficient. I think the problem is when you need to unpack the data to find out which pixel actually falls in an area. It’s the classic compression/decompression problem.

I think the join command is awesome. But, having 330 billion pixels really does clog the pipes up a bit :-)

Nonetheless, I hope you are as impressed as I am that the actual overlay got done in 17 minutes. I tried to do the overlay in PostGIS with the subdivided polygons and it was taking hours and hours, so I just killed the process. 17 minutes, wow.

tjhb

9,627 post(s)
#01-Apr-21 23:01

Yes I am just as impressed as you are. For two reasons. (1) You made something impossible into something possible. That's helpful! (2) At 17 minutes, you can afford to experiment or to tweak. At many hours to days, you have to get the thing right once, or you're either stuffed (if it didn't work) or quite grumpy (if it worked but suboptimally).

In both cases, a difference not just in degree, but in kind.

adamw


9,552 post(s)
#02-Apr-21 14:38

OK, so just to recap:

We have a raster of 330 billion pixels (that's short scale, right? 330 * 10^9?). In your example they are either 1 or 0, but I presume we can expect something like INT8 or INT16 and maybe conditions like BETWEEN 200 AND 300.

We need to turn each pixel that satisfies the condition (in your example, equal to 1) into a geom.

There are 61 million of such pixels in total (that's from the first post).

Using TileToValues (TileGeomToValues was likely a typo, right?) is too slow.

Correct?

61 million geoms is manageable, so if the numbers above are correct, a reasonably fast way to do this seems totally within reach.

artlembo


3,156 post(s)
#02-Apr-21 15:06

yes, you have that mostly correct. More specifically:

the raster: 1,295,779 x 256,004

and yes, mine is such a limited use case that I think it is reasonable to expect a BETWEEN clause like that for other uses.

TileToValues: I actually used TileGeomToValues, and yes, that was too slow. That gave me the X, Y, and Value. I did not actually try TileToValues. I think TileToValues would not work well on 330b pixels - it seems to first want to eat the elephant in one bite. I thought TileGeomToValues would be better because we are narrowing the area that we are searching.

Also, in this case, the data is very sparse - only .0004 of the pixels have a value we are interested in. So, I'm guessing there are tons of tiles that have nothing in them. I'm not sure if that helps any.

adamw


9,552 post(s)
#03-Apr-21 17:10

I did some experiments and, you know, using TileToValues isn't too bad, but yeah, there's room for improvement.

My test images were: small (441 million pixels) and large (44.1 billion pixels; it's just the small image increased 10x in both X and Y). Your image is 7.5x larger than my large one, but my large one is likely large enough for things to scale linearly, so we can just multiply the times for my large image by 7.5 and that produces some estimates for your image. I was creating points for pixels with a value that occurs slightly more frequently than yours (1.5x or so), but perhaps close enough.

At first I started with the simplest query I could write:

--SQL9
SELECT GeomMakePoint(VectorMakeX2(tx*128+x, ty*128+y)) AS [geom]
INTO [geom202] FROM (
  SELECT x AS tx, y AS ty, SPLIT CALL TileToValuesX3([tile]) FROM [small]
)
WHERE value = 202;

This finished in 615 seconds. In the top-level query, TX is the X coordinate of a tile, X is the X coordinate of a pixel within the tile, 128 is the tile size, so TX*128+X is the X coordinate of the pixel in the whole image, same for Y.
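For instance, with 128-pixel tiles, a pixel at x=17 in the tile at tx=3 ends up at global X = 3*128 + 17 = 401 (and likewise for Y).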

I tried optimizing the query on the small image before running it on the large one (who has time to run unoptimized first versions on full data), and filtered out tiles which have no pixels with the desired value:

--SQL9
SELECT GeomMakePoint(VectorMakeX2(tx*128+x, ty*128+y)) AS [geom]
INTO [geom202_2] FROM (
  SELECT x AS tx, y AS ty, SPLIT CALL TileToValues([tile]) FROM [small]
  WHERE TileValueMin(tile) <= 202 AND TileValueMax(tile) >= 202
)
WHERE value = 202;

This finished in 95 seconds.

I pushed computing TileValueMin / TileValueMax to threads, etc. This got the time down to 68 seconds, but the query became large, so I just kept the one above that finished in 95 seconds. I ran it on the large image, and it finished in 3448 seconds. I expected it to take longer, but there are various economies of scale which let it take just 36x as long as the query for the small image while the amount of data increased 100x (this happens, but unfortunately a lot of it stops once we run out of the cache, and with the large image we are already out of it, so there likely won't be any such savings as we increase the amount of data further).

OK, so this means that for your 330 billion pixel image we are probably looking at something like 7 hours. Given 1 byte per pixel, the size of your image is 330 GB plus something for masks (if all pixels are visible, this is negligible, so let's not count it). This is a lot of data; just reading it is going to strain the disk. The ballpark figure for how much data we would normally expect to read + analyze per unit of time is 50 MB/sec - obviously, modern disks can provide data at a faster rate, but data isn't always coming sequentially and there are all kinds of delays during processing, so 50 MB/sec is a quite fair number. With that, ideally we'd expect to process 330 GB in about 2 hours. So, there is a gap between 7 hours (expected from the query above) and 2 hours (ideal).
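Spelling out the arithmetic: 3448 sec * 7.5 is roughly 25,900 sec, or about 7.2 hours, for your image with the query above, versus an I/O-bound ideal of roughly 330,000 MB / 50 MB/sec = 6,600 sec, or about 1.8 hours.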

We think we can reduce this gap or maybe even close it completely with a couple of small additions and extensions to SQL. We'll see.

tjhb

9,627 post(s)
#03-Apr-21 19:21

Small anomaly? TileToValuesX3 in first (simplest) query.

Also, I'm interested in the combined prefilter

WHERE TileValueMin(tile) <= 202 AND TileValueMax(tile) >= 202

which would not have occurred to me.

Can you explain why, even though it is a composite test (scanning each tile twice?) it is so efficient?

And then... how about swapping the functions/conditions to

WHERE TileValueMin(tile) >= 202 AND TileValueMax(tile) <= 202

and omitting the outer WHERE clause, redundant now because 202 will uniquely satisfy the prefilter?

...Well, I think I can partly answer myself.

TileValueMin() <= x and TileValueMax >= y will both short-circuit, i.e. return true as soon as they find any value satisfying their conditions. They will only scan the full tile if the condition either fails or succeeds only on the very last value.

But TileValueMin() >= x and TileValueMax <= y will have to scan the full tile (twice) in all cases.

So this swap is a very bad idea!

But I'm still interested in why your dual pre-scan is efficient. I wouldn't have thought of this.

adamw


9,552 post(s)
#06-Apr-21 13:56

TileToValuesX3 should be TileToValues. (One of the tests was operating on a 3-channel image, taking one channel out using TileChannel, I mixed up things by mistake.)

TileValueMin ... AND TileValueMax ... does scan the pixels of each tile twice (once per function), but this happens in memory, which is fast. What would have ruined it is if the data set, which resides in a mix of memory and disk, were being scanned twice, but it only has to be scanned once.

I couldn't use WHERE TileValueMin(tile) >= 202 ... because in my test, the tiles contained a mix of values on both sides of 202, so if a tile contained pixels with values of 150, 202, 250, the test would have thrown it away - while I needed it to stay because 202 is there. Yes, anything in AND can short-circuit. Keep in mind that the query engine can re-arrange or re-write conditions, though, if it thinks it can improve things (we mostly improve the blatantly obvious).
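To make the prefilter logic concrete, here is a tiny standalone illustration in plain SQL (the tile names and min/max values are made up):

--SQL (illustration only)
-- three hypothetical tiles described by their per-tile min and max values;
-- only a tile whose range straddles 202 can possibly contain a pixel equal to 202
SELECT name
FROM (VALUES ('A', 0, 0), ('B', 150, 250), ('C', 203, 250)) AS t(name, vmin, vmax)
WHERE vmin <= 202 AND vmax >= 202;
-- keeps only 'B'; the reversed test (vmin >= 202 AND vmax <= 202) would reject 'B'
-- even though it might contain 202, which is why that swap doesn't work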

adamw


9,552 post(s)
#06-Apr-21 14:15

A small addition: scanning the pixels of each tile in memory twice improves the performance of the query because doing this is faster than converting each pixel into a record, then iterating on these records outside of the functions in the query engine. Converting anything to records - even virtual ones - and iterating on them is not free.

artlembo


3,156 post(s)
#03-Apr-21 22:57

I’ll give this a try tonight, and report back what I find.

artlembo


3,156 post(s)
#04-Apr-21 01:08

Adam,

I’m willing to bet that it will go a little bit faster than you think. The reason is, you are computing information about 330 billion pixels, but a majority of the tiles are all zeros. So hopefully, 95% of the image will just be ignored and you’ll only be processing the small portion of the globe that has the mangroves in it. I don’t know if some of those statistics are already calculated for the tiles or not. If not, then yes, you are probably going to have to search through all 330 billion pixels. But if they are predetermined with summary statistics, then the empty tiles can simply be ignored and none of that data needs to be processed.

tjhb

9,627 post(s)
#04-Apr-21 02:04

Yes. So the tile prefilter you will apply is just

WHERE TileValueMax([tile])

right?

artlembo


3,156 post(s)
#04-Apr-21 04:02

Exactly. I will be heading into lab until the afternoon, so I’ll let you know how it turns out

artlembo


3,156 post(s)
#05-Apr-21 14:08

so this is interesting - it took 87 minutes. Again, that is likely due to the fact that most of the tiles are empty. So, 87 minutes is certainly acceptable to me. But, here is another issue: it extracted 121 million points. So, it extracted double the points. I'm now going to see if 9 brought in too many points, or if the person who sent me the points from ArcGIS gave me too few.

danb


1,745 post(s)
#05-Apr-21 20:35

I am not sure if this is relevant to your case, but one time when I was putting together a multistep workflow involving large binary mask grids, I found it beneficial to do a pre-scan using (I think) TileCompare to identify all tiles that matched my JSON definition of a blank tile (there are probably better ways to do this).

I added an additional column to the image table and set an attribute to mark tiles that I could safely ignore from all further steps of the process and this made a big difference to the performance of the workflow as a whole.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

tjhb

9,627 post(s)
#05-Apr-21 20:51

Nice!

Besides TileCompare, you could equally use TileMax or TileMin I think? That might be faster.

danb


1,745 post(s)
#06-Apr-21 00:11

Thanks for the pointer. This does make more sense thinking about it.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

adamw


9,552 post(s)
#06-Apr-21 14:06

Well, I am not sure how the query with TileToValues can extract duplicate points. It's pretty straightforward. I would guess the extraction performed in ArcGIS was not 1 pixel = 1 point, perhaps it was merging points for small clusters of pixels. That's just a guess though.

Anyway, I can report that we did manage to eliminate the gap between the ideal processing time (based on IO rate) and the actual processing time entirely.

Here's a modified version of the query that you will be able to write in the upcoming build:

--SQL9
SELECT GeomMakePoint(VectorMakeX2(tx*128+x, ty*128+y)) AS [geom]
INTO [geom202] FROM (
  SELECT x AS tx, y AS ty,
    SPLIT CALL TileToValues(TileMaskRange(tile, 202, 202, TRUE, TRUE), FALSE)
  FROM [large]
);

There are two changes: (1) the new function called TileMaskRange takes a tile and a range of values (here, min=202, max=202, so just 202), checks for each pixel whether its value falls into the range, then turns that pixel invisible depending on the result (here, pixels with values falling into the range are kept visible and all others are turned invisible), (2) the new argument to TileToValues skips reporting invisible pixels.

With these changes, the time to process the large image (which used to be 3448 seconds) drops to 1091 seconds. Which is pretty much the lower bound expected for processing 50 GB of data (1000 seconds).

No threads needed as this is already bound by IO as it stands.

artlembo


3,156 post(s)
#06-Apr-21 15:47

thanks. This is great, I look forward to trying it out. I ran the overlay with the 121M points, and the overlay worked in 24 minutes (up from 17 minutes with 61M points). So, this is plenty fast.

I decided to rerun the query with the original monkey polygons (as opposed to the ST_SubDivide version). It has now been over an hour, and it is still processing. So, the ST_SubDivide was a big win in this case.

tjhb

9,627 post(s)
#06-Apr-21 18:09

GeomToConvex(Par) would probably do just as well as ST_Subdivide? And might even be faster with appropriate use of threads?

artlembo


3,156 post(s)
#06-Apr-21 18:32

maybe. I’ll test it out. It’s just nice to be able to define the number of vertices.

artlembo


3,156 post(s)
#06-Apr-21 19:20

still running after 4 hours, so I killed it. Therefore, ST_SubDivide is definitely necessary. I'm going to try Tim's suggestion next about Convex Parts. We'll see how that works. It creates many small polygons, but it also has some really large ones. So, if I were a betting man, I'd say they should be close. Hopefully, I'll have an answer in about a half hour.

artlembo


3,156 post(s)
#06-Apr-21 20:18

doing the overlay with the ConvexParts took 34 minutes vs. 24 minutes with ST_SubDivide. Given that the problem was virtually unsolvable without splitting the polygons, ConvexParts is perfectly acceptable - that is, what's 10 minutes? And it doesn't require you to move data in and out of Postgres to perform the ST_SubDivide.

Nonetheless, I still think for big data analysis, ST_SubDivide and the ability to create a hexagonal grid are important tools to have.

tjhb

9,627 post(s)
#06-Apr-21 20:28

Excellent! Well said.

If we had the equivalent of ST_Subdivide I would certainly use it as well, especially on contour-derived and landcover data.

I'm glad that convex parts is comparatively not too bad though! Good test Art. This whole thread increases community knowledge.

danb


1,745 post(s)
#07-Apr-21 00:37

I'd always been laboring under the impression that something mathematically magical happened regarding point in polygon operations as soon as convex polygons came into play. Is this not the case? Is simply splitting complex areas into many simpler parts enough to gain additional performance?


Landsystems Ltd ... Know your land | www.landsystems.co.nz

tjhb

9,627 post(s)
#07-Apr-21 01:11

I think ST_Subdivide (also) guarantees convexity. All the resulting parts are convex I think? [No.]

So the magic applies here too.

[I am wrong. Concavity is allowed in the result of ST_Subdivide.]

tjhb

9,627 post(s)
#07-Apr-21 02:15

It would be interesting to know the maximum number of vertices Art specified for ST_Subdivide (and how sensitive this was).

tjhb

9,627 post(s)
#06-Apr-21 21:05

Both parts of

(1) the new function called TileMaskRange takes a tile and a range of values..., checks for each pixel whether its value falls into the range, then turns that pixel invisible depending on the result..., (2) the new argument to TileToValues skips reporting invisible pixels.

are going to be fantastically useful. Every day, brilliant.

As a general question, how much slower would it be to use explicit string arguments (which can be visibly meaningful) rather than Boolean flags (which require reference to the manual)?

I would often prefer the former if they were not significantly slower.

(At school--referencing the other great recent thread--we used not punch cards but the kind you could block out with a graphite pencil. Face to face with Turing, a great way to learn as Ron said. Just not sure we still always need Booleans as args.)

adamw


9,552 post(s)
#07-Apr-21 10:03

Agree booleans aren't great, they are hard to distinguish from each other and it is hard to remember what TRUE means for each.

String parameters would be slow, however. We consider packing parameters into a JSON for big functions like Kriging, but doing this for small functions is too expensive.

There are two other options:

(1) Replace boolean flags with named integer constants. If there are multiple flags, combine them together using BITOR. Eg: GeomClip(geom, clip, GeomClipInner, 0). Pros: works right now; adding a new flag is better than adding a new parameter, as the function prototype does not change. Cons: BITOR might feel too technical for SQL, there are tons of named constants, and it is possible to mistakenly use constants meant for function A in a call to function B.

(2) Extend SQL to allow calls with named parameters. Eg: TileMake(cx = 100, cy = 100, value = 5). Pros: all parameters can be annotated well, no restrictions on function design, works for user-defined functions. Cons: not traditional SQL, changing the name of a parameter breaks callers, parameter names become unlocalizable.

Not sure about any of them. (I mean, 2 is tempting, but seems too brittle with parameter names having to stay the same. No idea how this would work out in practice, could be bad.)

tjhb

9,627 post(s)
#07-Apr-21 13:25

Thanks for explaining that!

Not really worth improving on the current situation then.

It's not broken after all, just requires thought and memory and, well, checking. None of that is bad.

Thanks again.

Dimitri


6,511 post(s)
#07-Apr-21 14:04

Agree booleans aren't great,

Respectfully disagree. The best thing about booleans is that even if you are wrong, you're only off by a bit. :-)

danb


1,745 post(s)
#02-Apr-21 05:30

In general, our focus is on making 9 a much better fit for your "Smaller data" scenarios. That's what we are concentrating on currently

I am really glad to hear this is the case Adam. I love the power of M9 for carving up large datasets, but I would estimate that I probably have an 80:20 split between smaller tasks and those involving large datasets with millions of geometries.

While I appreciate that you are likely having to design for potentially massive datasets, it would be nice if we could gain some of the fluidity of M8 workflows for those more mundane GIS tasks, or those where there is just a bit more data than M8, Arc or Q would normally be comfortable with. For me it is mostly about getting familiar with my data as a workflow develops, rapid filtering and selection, looking for things that don't look quite right, and of course ViewBots, which I seem to mention quite a bit.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

danb


1,745 post(s)
#02-Apr-21 05:21

You can see my logic for subdividing the data here. After completing the join, I knew how many points were in each subdivided polygon, and simply did a sum and grouped by the monkey polygon id.

I use this strategy quite regularly, decomposing complex polygons to convex parts. In fact, the attached screenshot shows an example from just yesterday where I am trying to build the equivalent of ArcMap's Eliminate function. I need to see what class the polygons adjacent to the small polygons to be eliminated (purple) belong to. If I decompose the larger polygons to convex parts first, the adjacency test is much, much faster.
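For what it's worth, a PostGIS-flavored sketch of that kind of adjacency test (my actual workflow is in 9, and the table names small_polys(id, geom) and parts(class, geom) are hypothetical):

--PostGIS
-- for each small polygon to be eliminated, find the class of the adjacent convex
-- part that shares the longest boundary with it (Eliminate-style behaviour)
SELECT DISTINCT ON (s.id)
       s.id, p.class AS neighbour_class
FROM small_polys AS s
JOIN parts AS p
  ON ST_Intersects(s.geom, p.geom)
ORDER BY s.id,
         ST_Length(ST_Intersection(ST_Boundary(s.geom), ST_Boundary(p.geom))) DESC;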

Attachments:
Clipboard-1.png


Landsystems Ltd ... Know your land | www.landsystems.co.nz

Dimitri


6,511 post(s)
#02-Apr-21 14:31

The biggest Con with the help manual is searching for what you are looking for. It's terrible in my opinion. I think a .chm provides so much greater flexibility. I would hope they could produce something like that. I truly hate the search function on the browser and the index tab.

100% agree. The advice to use Google was good, but inconvenient.

I'm super happy to see that as of today the user manual for 9 has an embedded Google search box at the top of every page and the index/search tabs are gone. The search box searches only the user manual for 9. You can use all Google search syntax in the search box.

Depending on the browser that you're using, make sure to clear cached images and files, so your browser won't try to use old versions of the table of contents frame, etc. In the new Bing browser (a really super browser), that's in their Settings - Clear browsing data sub page.

Google's very good, but the search box will pick up only what is crawled by Google, so it might be a few days before it picks up very new changes. It also depends, of course, on having an Internet connection. But I think that's a fair tradeoff for better quality search.

artlembo


3,156 post(s)
#02-Apr-21 15:13

yes, you are right - as of today, the search bar does work much better. Could've used that yesterday

Let's give this another week or so, and see how well it works. This could be sufficient.

BTW, was that change always being planned, or was it in response to what was mentioned earlier?

Dimitri


6,511 post(s)
#02-Apr-21 15:39

That was something always planned, but moved up in priority based on community feedback. :-)

joebocop
454 post(s)
#02-Apr-21 19:38

Do you mean that sufficient Suggestions were received, following the instructions, or that company representatives reading the forum felt compelled?

hphillips31 post(s)
#02-Apr-21 16:00

It took a couple of tries using the Google Search box in the online manual to realize one has to scroll down past all the ad results (a screenful) to get to the search results relevant to Manifold.

adamw


9,552 post(s)
#02-Apr-21 16:04

What were the search terms? The number of ads should be the same as if the search was done from the main Google site.

dchall8
847 post(s)
#02-Apr-21 18:13

I searched for layouts and got 4 ads above the Manifold returns.

hphillips31 post(s)
#02-Apr-21 18:20

'Join', 'Topology'. A direct Google search from Google gives me definitions at the top, not ads like the search from the Manifold online help. But once past the ads, the searches are all totally Manifold-relevant and useful through the Manifold Help entry point.

Attachments:
join.png
topology.png

Dimitri


6,511 post(s)
#02-Apr-21 18:36

Interesting.... I just tried 'Join' and didn't get any ads at all. Likewise for 'topology' or layouts.

The algorithms Google uses to show you ads are hard to predict. It depends on who they think you are, where they think you are, what the time of day is, your personal browsing history, what you've clicked on in the past, what you've watched on TV (if you have a Google-mediated TV setup), what you've watched on YouTube and many other factors, like what you buy online.

Earlier today I saw more ads, but now I can't get them to show at all, even trying search terms you'd think would trigger ads, like

lose weight (no quotes), or real estate or vacation or chocolate.

Just out of curiosity, I Googled

How do I remove ads from Google search results?

... and at the top of the results page was...

About 695,000,000 results (0.71 seconds)

hphillips31 post(s)
#02-Apr-21 19:14

No ads now for searches on join, topology or layouts, just Manifold results when searching from Manifold online Help. Hopefully it will stay that way!

drtees101 post(s)
#02-Apr-21 23:01

I recently ran across a similar situation. I have a LiDAR data set containing 2,985,866,401 pixels, and I needed to create a basin for a wetland using the Watershed command (yes, this is M8). M8 loaded up the file with no issues, but I know from experience that the data contain a lot of sinks. I use the fill sinks transform to fix that problem first. Once the sinks are filled, M8 is pretty speedy at creating the watersheds and streams. However, most of my experience is with a small subset of a larger LiDAR file. Why do an entire file when you only need something that is perhaps a half mile on a side?

Back to the LiDAR file; I had M8 start the process of filling the sinks and it went to work. After 136 hours, it was only 19% done and probably would be crunching the data for another couple of weeks (I needed to pause M8 a couple of times).

While M8 was working, I was curious to see what M9 could do with the same LiDAR dataset. In the amount of time (minus a weekend) that I had M8 running, M9 filled the sinks and created several watershed layers, taking no more than one and three quarter hours per run. Most of those runs were to try expanding the minimum flow values to reduce the number of polygons generated. Basically M9 did several iterations of watershed areas and lines while M8 was still tying its shoe laces. I will be exporting the data back to M8 because the presentation tools are much better than the current version of M9.

I am looking forward to using M9's muscle on future projects. I am also thinking that I will need to upgrade my desktop computer. Two Xeon processors with two logical cores each appeared to be straining my computer resources. My final experiment will be to run M9 on my Mac (six cores) using Parallels to run Win10. Parallels will only take four of the cores. I also stitch together large numbers of aerial photos taken from our drone and learned that my Mac could complete that job long before my office desktop could.
