Subscribe to this thread
Home - General / All posts - Speed of our own functions
artlembo


2,916 post(s)
online
#02-Dec-17 02:43

If MF doesn't have a function, we can create our own, using a script. That is very cool. But, what kind of hit will we take in terms of processing speed?

For instance, I want to quantify all taxi rides by the day of the week, but MF doesn't have a day of week function. So, I created a VBScript, and call it from SQL:

-- VBSCRIPT --

Function GetDay(indate)

  GetDay = WeekDayName(WeekDay(indate))

End Function

Sub Main

 Manifold.Application.Log "Hello, World!"

 Manifold.Application.OpenLog

End Sub

-- SQL --

FUNCTION GetDay(x DATETIME) varchar

AS SCRIPT [FunctionLib] ENTRY 'GetDay';

SELECT count(*), day FROM

(

SELECT GetDay([pickup_datetime]) AS day 

FROM [taxiSandy] 

)

GROUP BY  day

so, in the above script, GetDay calls the VBScript code that returns the day of the week.

It takes a long time, but then again, I have 13M records. Just wondering if calling the script slows things down.

tjhb

7,545 post(s)
#02-Dec-17 03:10

A .NET scripting language will be faster, no unnecessary type conversions.

VB.NET, C#, IronPython...

artlembo


2,916 post(s)
online
#02-Dec-17 03:19

Yes, and .NET is compiled vs. interpreted. But even with a .NET script, will we take a hit on performance?

tjhb

7,545 post(s)
#02-Dec-17 03:23

Compared to straight SQL? No. They should be equally fast. Often the nearest .NET equivalent will be faster than the best SQL.

But any ECMA language will be noticeably slower, because of conversion to and from System.Object.

(IronPython is effectively interpreted—i.e. dynamic. But still fast because it uses native .NET types. Manifold 9 types are all blittable.)

tjhb

7,545 post(s)
#02-Dec-17 04:10

Also I don’t think ECMA languages can be called from multiple threads—or more accurately, parallel calls must be serialized. Whereas .NET calls can remain fully parallel.

adamw


7,307 post(s)
#02-Dec-17 14:41

It depends on the operation. The heavier the script function, the less the overhead. But even for small script functions, the overhead isn't that big.

Consider the following example query:

--SQL9

FUNCTION F(value INT32INT32 AS StringLength(CAST(value AS NVARCHAR)) END;

SELECT Sum(F(value)) INTO t_2 FROM CALL ValueSequence(1, 1000000, 1);

The operation (F) is tiny, that emphasizes the overhead heavily. We just convert an integer value to a string and then measure the length of the string, and do this for a million numbers.

I created two other versions of it separating the function into a script in C# or VBScript like this:

--SQL9

FUNCTION F(value INT32INT32 AS SCRIPT [s4_vbscript];

SELECT Sum(F(value)) INTO t_4 FROM CALL ValueSequence(1, 1000000, 1);

Script:

'VBScript

Function F(value)

  F = Len(CStr(value))

End Function

 

Sub Main

End Sub

Here are the performance numbers from the log:

Query: [q2_function] (2.578 sec)

Query: [q3_csharp] (3.767 sec)

Query: [q4_vbscript] (3.382 sec)

As you can see, the overhead is not big. (And VBScript apparently won over C#, usually it goes the other way.)

If we inline the function, we get this:

Query: [q1] (1.250 sec)

...so the overhead for calling script functions is comparable to that for calling other query constructs.

See attached MXB.

Attachments:
sql-script-performance.mxb

adamw


7,307 post(s)
#02-Dec-17 14:52

Actually, I found most of the reason why VBScript won over C# - I just used a more complex operation in C# than is necessary: string.Format("{0}", value).Length. If we simplify this to just: value.ToString().Length, the time for the C# version is on par with VBScript.

This does not matter much to the point of the post.

(And yes, I would expect IronPython to have numbers in the same ballpark and have the same small overhead.)

The long and short of it is: if the performance is bad, it is probably not because you are calling a script function.

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.