thoughts on more multi-threading
artlembo


2,916 post(s)
#05-Dec-17 22:10

As I sit here, waiting for a database of 12M points to import, I have some time to write this question. Personally, I'd rather be doing some low-level work in MF, like checking how many objects are in a particular drawing, or starting to write an SQL query. But while a database import or another long task is running, I can't do anything else (currently, my SQL query creates a new data source, does a SELECT INTO, ALTERs the table, and then creates two drawings).

Is there no way to allow something like that to run in the background? This prevents me from doing other work for the next 15 minutes.

tjhb

7,545 post(s)
#05-Dec-17 22:30

Dan made exactly the same point in offline conversation recently. So can I add our two votes?

If import could be made a background process, that would be fantastic.

The most important resource is always the user.

antoniocarlos

471 post(s)
#05-Dec-17 22:41

Why isn't it an option to open a second instance of Future and link to the one that is running the import?


How soon?

tjhb

7,545 post(s)
#05-Dec-17 22:57

(a) Because it is not possible to add a .map file as a new data source, if it is already open in another session. ("This file is in use.")

(b) In any case, import as a background process would be better.

antoniocarlos

471 post(s)
#05-Dec-17 23:36

I had never tried (a) with Future, but figured it was something that Future would allow. Absolutely, (b) is better.


How soon?

Dimitri


4,332 post(s)
#06-Dec-17 05:21


Agree with (b), but until that happens using a second session is realistic because imports can take time while .map files open instantly.

Suppose you are in the middle of doing work and you want to import something like Art's big job. Launch a second session of Future (effortless... right click on the Manifold logo in the task bar and one click on Manifold) and import. Save that .map file and close.

Next, link that .map into your first session. Opens instantly. Or, copy / paste between sessions to get your working files together into the same session.

The above is not a bad way to go with larger imports in any event, since one often wants to have them converted into .map format as standalone resources, to be linked into whatever project uses them.

Like I say, the (b) scenario is necessary regardless, but until then the above is an option.

dyalsjas

10 post(s)
#05-Dec-17 23:44

This sounds like a use case for a shared datastore; you're loading data into Postgres with one instance of MF9 and updating another layer in Postgres with another MF9 instance.

adamw


7,307 post(s)
#06-Dec-17 09:11

We can perform the import in background no problem in that there are no big technical limitations, everything is thread-safe, etc. We don't do this because we don't want to expose components that are in the process of being built (fields / indexes / records / metadata) to the user. Half-created components will likely refuse to work as they should, but what's even worse is that once they accumulate enough data from the import to actually work, the user might start making changes to them and that's unlikely to end well.

Preventing the user from accessing half-created components is difficult enough as it is (it is not enough to just let the UI know that a component is in the process of being built, this has to be carried to scripts and queries = to the data level), but suppose we solve it. What should the process look like? You start the import and it goes into some pane, fine. When the import finishes, there will be some notification and the results of the import will be fully available. But what happens in the middle?

Should the UI not display any of the imported components until the import completes (that's fine, but there are all sorts of weird things like the user creating a drawing named "States" and getting "States 2", because a background import took the name "States" already)?

Should a second import into the same MAP file be allowed to start concurrently with the first (again, that's fine, the data will survive, but the name clashes between the imports will be resolved weirdly)?

What about the export - if you start an import and then try to export the MAP file, should the export not see any of the imported components because the import did not yet complete when you started the export (the import could complete mid-way through the export, but presumably this should not make the imported components available to the export, because the export already started)?

In fact, shouldn't export also be done in background? But if it does go into the background, perhaps all components and the database should be made readonly for the duration, right? Otherwise you would be able to delete a component and it would be unclear whether the exported file will have it or not.
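
To make the "States" / "States 2" oddity concrete, here is a toy sketch in Python. It is not Manifold code: the `ComponentNames` class and its `claim` method are hypothetical, and the rule of appending the next free number is only an assumption about how clashes might be resolved. The point is that a name held by a background import the user cannot see still blocks the user's own choice.

```python
class ComponentNames:
    """Hypothetical name registry for the components of one project."""

    def __init__(self):
        self._taken = set()

    def claim(self, wanted):
        """Return `wanted` if it is free, else the first free 'wanted N' (N >= 2)."""
        name, n = wanted, 1
        while name in self._taken:
            n += 1
            name = f"{wanted} {n}"
        self._taken.add(name)
        return name


names = ComponentNames()
hidden_import = names.claim("States")   # background import, not yet shown in the UI
user_drawing = names.claim("States")    # user sees no "States", yet gets "States 2"
second_import = names.claim("States")   # a second, concurrent import

print(hidden_import, "|", user_drawing, "|", second_import)
# prints: States | States 2 | States 3
```

Nothing is lost in this sketch, but who ends up with which name depends purely on timing, which is the quirk described above.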

From what we see, the issue is mostly about the user experience quickly becoming too complex and quirky.

Maybe it would make sense to allow converting data to MAP file in background and then linking the result as a data source - that stops introducing components into the opened MAP file (only adds one for the linked data source and does so at the very start) and avoids all of the above issues.
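
A rough Python sketch of that last idea, purely as illustration. The `Project` class, `link_in_background`, and the `ready` flag are all invented for this example and are not Manifold's API. The assumption is that the project gains exactly one placeholder data source at the start, the conversion runs in a worker thread without touching anything else, and the result is published in a single step at the end.

```python
import threading


class Project:
    """Hypothetical stand-in for an open project; not Manifold's API."""

    def __init__(self):
        self.components = {}      # name -> data, in the project root
        self.data_sources = {}    # name -> {"ready": bool, "components": dict}
        self._lock = threading.Lock()

    def link_in_background(self, source_name, convert):
        """Add one placeholder data source now; fill it when convert() returns."""
        with self._lock:
            self.data_sources[source_name] = {"ready": False, "components": {}}

        def worker():
            result = convert()            # long-running, touches nothing shared
            with self._lock:
                entry = self.data_sources[source_name]
                entry["components"] = result
                entry["ready"] = True     # published only once fully built

        thread = threading.Thread(target=worker)
        thread.start()
        return thread


release = threading.Event()               # lets the demo control when the job ends


def slow_convert():
    release.wait()                        # stands in for a long conversion
    return {"States": ["record 1", "record 2"]}


project = Project()
job = project.link_in_background("big_import", slow_convert)

# The user keeps working in the project root; there is no name clash,
# because the imported "States" lives inside the child data source.
project.components["States"] = ["my own drawing"]
ready_during = project.data_sources["big_import"]["ready"]

release.set()
job.join()
ready_after = project.data_sources["big_import"]["ready"]
print(ready_during, ready_after)          # prints: False True
```

Because the import only ever writes into its own data source, the questions about half-built components and clashing names in the project root do not arise in this sketch.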

tjhb

7,545 post(s)
#06-Dec-17 10:11

I think the last para is the right idea. The experience would then be closely analogous to what happens when we add a remote datasource that takes a little while to become fully populated. The datasource icon would start as greyed out, the new datasource should be inaccessible until import is complete. (The point is that in the meantime we can still work on items in the project root, or in other child datasources.)

After import we can move data to the project root if necessary (or just move or copy some of it, possibly filtering by query). If the import is huge then it makes good sense to leave it in the child datasource for good organisation (as a kind of folder with autonomy).

I think it should still be possible to import directly to the project root—?—and that this should stay modal, to avoid clashes and conflicts as you say. So that a background import (to a new child datasource) would be an option, a new checkbox in the Import dialog. Clear choice, clear difference.

I think it’s less important for export to go in the background. Usually the user’s time is more precious, and multitasking much more important, near the start of work than when the project is finished. (That doesn’t always apply, but usually.)

An export with a background import in progress should just omit the new (pending) datasource.
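
In sketch form (Python, with a made-up `export_snapshot` helper and a made-up `ready` flag, not anything from Manifold), the rule could be as simple as fixing the list of data sources at the moment the export starts:

```python
def export_snapshot(data_sources):
    """Return the names of data sources an export started right now would include."""
    return [name for name, info in data_sources.items() if info["ready"]]


sources = {
    "roads": {"ready": True},
    "big_import": {"ready": False},      # background import still running
}

included = export_snapshot(sources)      # decided once, when the export starts
sources["big_import"]["ready"] = True    # import finishes mid-export

print(included)                          # prints: ['roads']
```

The import completing part-way through does not change what the export already decided to write.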

adamw


7,307 post(s)
#06-Dec-17 11:36

...then again, if we are talking about starting the import and having it go into a separate MAP file and then adding that as a data source, isn't that just a weird form of linking?

We understand the desire to put a long operation into the background so that you can do something else while it completes there. It just seems that some of the traditional operations like import are a bad fit for that because they are not isolated from each other conceptually. That's perhaps similar to how single-threaded code frequently has to be rethought and redone in order to use multiple threads, just this time we see this with the UI processes.
