Running sklearn functions in Grasshopper

Hi,

I am trying to run functions from the sklearn library within Grasshopper. For this I am using compas.rpc.

The following test code with the numpy library works:

[screenshot: numpy test code]

However, it does not work with a function from the sklearn library:

[screenshot: sklearn test code]

This code gives the error message: “No module named sklearn.ensemble”

What is the problem here?

the proxy doesn’t actually allow you to import functions from packages that don’t exist in GH. it just delegates calls to such functions to a command server running externally and takes care of the serialization of input and output for you.

so the first example correctly creates a proxy for numpy and lets the proxy delegate a call to numpy.array via the command server.

the second example tries to import a class from sklearn.ensemble into GH via the proxy, then create an instance of that class in GH, and call a method of the instance. this is simply not possible.
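To illustrate why plain function calls can be delegated but class instances cannot, here is a minimal toy sketch. It is not how compas.rpc is implemented; `ToyProxy`, `server_handle`, `SERVER_FUNCTIONS` and `Model` are hypothetical names, and JSON stands in for whatever serialization the real proxy uses. The point is only that everything crossing the wire has to be reducible to plain data.

```python
import json

# Hypothetical stand-in for the functions available on the server side.
SERVER_FUNCTIONS = {
    "total": lambda values: sum(values),
}


def server_handle(request_text):
    """Decode a request, run the named function, encode the result."""
    request = json.loads(request_text)
    func = SERVER_FUNCTIONS[request["function"]]
    result = func(*request["args"])
    return json.dumps({"result": result})


class ToyProxy(object):
    """Turns attribute access into a serialized call to server_handle."""

    def __getattr__(self, name):
        def call(*args):
            # Arguments must survive serialization to reach the server.
            request_text = json.dumps({"function": name, "args": args})
            return json.loads(server_handle(request_text))["result"]
        return call


class Model(object):
    """An arbitrary class instance, standing in for a classifier object."""


proxy = ToyProxy()
print(proxy.total([1, 2, 3]))  # plain lists and numbers cross the wire fine

try:
    proxy.total(Model())
except TypeError as error:
    # json cannot serialize an arbitrary object instance
    print("cannot send an object instance: %s" % error)
```

In this toy, only the name of a function and its plain-data arguments ever leave the client, which is why a class living on the server cannot be instantiated and used on the client side.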

for something like this to work, you would have to make a wrapper function that does this for you and can be called instead.

see also here for more info

Thanks for your reply!
Ok, so I am guessing it is also not possible with XFunc?
You suggested creating a wrapper function. Could you elaborate on that, so I know where to look?

no, and this has nothing to do with limitations of RPC and/or XFunc (which is just an old version of RPC). CPython-specific packages like numpy or scikit-learn simply cannot be imported in IronPython 2.7…
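A quick way to see which interpreter a script is running in, and whether a given package is importable there, is sketched below. `can_import` is a hypothetical helper written for this illustration, not part of COMPAS. The code sticks to constructs that should also work in Python 2.7.

```python
import platform


def can_import(module_name):
    """Return True if this interpreter can import module_name."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


# Reports "IronPython" inside Rhino/GH and "CPython" in a regular Python install.
print(platform.python_implementation())

# A standard-library module is available everywhere.
print(can_import("json"))

# Availability of sklearn depends on the interpreter and its installed packages.
print(can_import("sklearn.ensemble"))
```

Running this once in the GH Python component and once in the environment where the RPC server runs shows on which side a package actually lives.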

about the wrapper.

in general it is not a good idea to use RPC for individual calculation steps, because the overhead accumulates: data is constantly converted between, for example, numpy arrays on the server side and the equivalent Python lists on the Rhino/GH side, and it has to be serialized and deserialized on both sides of the wire to be sent across.

what makes more sense is to bundle such calculations in an algorithm which runs entirely on the server and to which only 1 call has to be made.

for example, compas.numerical.fd_numpy computes the equilibrium of a network represented by vertices and edges, and a few other parameters, using a combination of numpy-based calculation steps.

although all of these steps could be executed individually via RPC, it makes more sense to make just one call to the entire algorithm.

from compas.rpc import Proxy

proxy = Proxy('compas.numerical')

...

result = proxy.fd_numpy(vertices, edges, ...)
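The difference between calling individual steps and calling one bundled algorithm can be made concrete with a toy channel that counts round trips and payload size. This is only a sketch under simplifying assumptions: `CountingChannel`, `scale`, `shift` and `scale_then_shift` are made-up names, JSON stands in for the real serialization, and nothing here measures actual compas.rpc performance.

```python
import json


class CountingChannel(object):
    """Toy channel that serializes every call and counts the traffic."""

    def __init__(self):
        self.calls = 0
        self.payload_chars = 0

    def call(self, func, *args):
        # Client -> server: the arguments are serialized and sent.
        request = json.dumps(args)
        self.calls += 1
        self.payload_chars += len(request)
        result = func(*json.loads(request))
        # Server -> client: the result is serialized and sent back.
        reply = json.dumps(result)
        self.payload_chars += len(reply)
        return json.loads(reply)


def scale(values, factor):
    return [v * factor for v in values]


def shift(values, offset):
    return [v + offset for v in values]


def scale_then_shift(values, factor, offset):
    """Bundled algorithm: both steps run on the 'server' in one call."""
    return shift(scale(values, factor), offset)


data = list(range(1000))

# Step by step: the intermediate result travels back and forth.
stepwise = CountingChannel()
intermediate = stepwise.call(scale, data, 2)
stepwise_result = stepwise.call(shift, intermediate, 1)

# Bundled: one call, only the input and the final result travel.
bundled = CountingChannel()
bundled_result = bundled.call(scale_then_shift, data, 2, 1)

print("stepwise: %d calls, %d characters" % (stepwise.calls, stepwise.payload_chars))
print("bundled:  %d calls, %d characters" % (bundled.calls, bundled.payload_chars))
```

Both variants give the same result, but the stepwise version makes more calls and moves the intermediate list across the channel twice. With more steps, the gap grows accordingly.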

btw,

with the next release of COMPAS you will be able to do the following:

# e.g. /Users/username/Desktop/my-sklearn-funcs.py

from sklearn.ensemble import RandomForestClassifier

def predict(X, classification, random_state=0):
    clf = RandomForestClassifier(random_state=random_state)
    clf.fit(classification[0], classification[1])
    return clf.predict(X)

# Grasshopper

from compas.rpc import Proxy

X = ...
classification = ...  # (training samples, training labels), as expected by predict

with Proxy('my-sklearn-funcs', path='/Users/username/Desktop/') as proxy:
    result = proxy.predict(X, classification)
    print(result)