Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

c - interfacing Python and Torch7(Lua) via shared library

I am trying to pass data (arrays) between python and lua and I want to manipulate the data in lua using the Torch7 framework. I figured this can best be done through C, since python and lua interface with C. Also some advantages are that no data copying is needed this way (passing only pointers) and is fast.

I implemented two programs, one where lua is embedded in c and one where python passes data to c. They both work when compiled to executable binaries. However when the c to lua program is instead made to be a shared library things don’t work.

The details: I’m using 64-bit ubuntu 14.04 and 12.04. I’m using luajit 2.0.2 with lua 5.1 installed in /usr/local/ Dependency libs are in /usr/local/lib and headers are in /usr/local/include I’m using python 2.7

The code for the c to lua program is:

tensor.lua

require 'torch'

function hi_tensor(t)
   print(‘Hi from lua')
   torch.setdefaulttensortype('torch.FloatTensor')
   print(t)
return t*2
end

cluaf.h

void multiply (float* array, int m, int n, float *result, int m1, int n1);

cluaf.c

#include <stdio.h>
#include <string.h>
#include "lua.h"
#include "lauxlib.h"
#include "lualib.h"
#include "luaT.h"
#include "TH/TH.h"

void multiply (float* array, int m, int n, float *result, int m1, int n1)
{
    lua_State *L = luaL_newstate();
    luaL_openlibs( L );

    // loading the lua file
    if (luaL_loadfile(L, "tensor.lua") || lua_pcall(L, 0, 0, 0))
    {
        printf("error: %s 
", lua_tostring(L, -1));
    }

    // convert the c array to Torch7 specific structure representing a tensor
    THFloatStorage* storage =  THFloatStorage_newWithData(array, m*n);
    THFloatTensor* tensor = THFloatTensor_newWithStorage2d(storage, 0, m, n, n, 1);
    luaT_newmetatable(L, "torch.FloatTensor", NULL, NULL, NULL, NULL);

    // load the lua function hi_tensor
    lua_getglobal(L, "hi_tensor");
    if(!lua_isfunction(L,-1))
    {
        lua_pop(L,1);
    }

    //this pushes data to the stack to be used as a parameter
    //to the hi_tensor function call
    luaT_pushudata(L, (void *)tensor, "torch.FloatTensor");

    // call the lua function hi_tensor
    if (lua_pcall(L, 1, 1, 0) != 0)
    {
        printf("error running function `hi_tensor': %s 
", lua_tostring(L, -1));
    }

    // get results returned from the lua function hi_tensor
    THFloatTensor* z = luaT_toudata(L, -1, "torch.FloatTensor");
    lua_pop(L, 1);
    THFloatStorage *storage_res =  z->storage;
    result = storage_res->data;

    return ;
}

Then to test I do:

luajit -b tensor.lua tensor.o

gcc -w -c -Wall -Wl,-E -fpic cluaf.c -lluajit -lluaT -lTH -lm -ldl -L /usr/local/lib

gcc -shared cluaf.o tensor.o -L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl -Wl,-E -o libcluaf.so

gcc -L. -Wall -o test main.c -lcluaf

./test

The output:

Hi from lua
 1.0000  0.2000
 0.2000  5.3000
[torch.FloatTensor of dimension 2x2]

c result 2.000000 
c result 0.400000 
c result 0.400000 
c result 10.60000

So far so good. But when I try to use the shared library in python it breaks.

test.py

from ctypes import byref, cdll, c_int
import ctypes
import numpy as np
import cython

l = cdll.LoadLibrary(‘absolute_path_to_so/libcluaf.so')

a = np.arange(4, dtype=np.float64).reshape((2,2))
b = np.arange(4, dtype=np.float64).reshape((2,2))

l.multiply.argtypes = [ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int,     ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int]
a_list = []
b_list = []

for i in range(a.shape[0]):
    for j in range(a.shape[1]):
            a_list.append(a[i][j])

for i in range(b.shape[0]):
     for j in range(b.shape[1]):
        b_list.append(b[i][j])

arr_a = (ctypes.c_float * len(a_list))()
arr_b = (ctypes.c_float * len(b_list))()

l.multiply(arr_a, ctypes.c_int(2), ctypes.c_int(2), arr_b, ctypes.c_int(2), ctypes.c_int(2))

I run:

python test.py

and the output is:

error: error loading module 'libpaths' from file '/usr/local/lib/lua/5.1/libpaths.so':
    /usr/local/lib/lua/5.1/libpaths.so: undefined symbol: lua_gettop

I searched for this error here and everywhere on the web but they either suggest (1) to include -Wl,-E to export symbols or (2) to add dependencies on linking which I did. (1) I have -Wl,-E but it seems to not be doing anything. (2) I have included the dependencies (-L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl)

The python test fails not when the shared library is imported but when the ‘require torch’ inside lua is called. That is also the different thing in this case from the other cases I found.

luajit.so defines the symbol lua_gettop (nm /usr/local/lib/luajit.so to see that) lua.h defines LUA_API int (lua_gettop) (lua_State *L);

I guess when compiling c to binary all works because it finds all symbols in lua.h but using the shared library it doesn’t pick lua_gettop from luajit.so (I don’t know why).

www.luajit.org/running.html says: 'On most ELF-based systems (e.g. Linux) you need to explicitly export the global symbols when linking your application, e.g. with: -Wl,-E require() tries to load embedded bytecode data from exported symbols (in *.exe or lua51.dll on Windows) and from shared libraries in package.cpath.'

package.cpath and package.path are:

./?.so;/usr/local/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/loadall.so

./?.lua;/usr/local/share/luajit-2.0.2/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua

Here is what nm libcluaf.so returns:

00000000002020a0 B __bss_start
00000000002020a0 b completed.6972
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000a50 t deregister_tm_clones
0000000000000ac0 t __do_global_dtors_aux
0000000000201dd8 t __do_global_dtors_aux_fini_array_entry
0000000000202098 d __dso_handle
0000000000201de8 d _DYNAMIC
00000000002020a0 D _edata
00000000002020a8 B _end
0000000000000d28 T _fini
0000000000000b00 t frame_dummy
0000000000201dd0 t __frame_dummy_init_array_entry
0000000000000ed0 r __FRAME_END__
0000000000202000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000000918 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000201de0 d __JCR_END__
0000000000201de0 d __JCR_LIST__
                 w _Jv_RegisterClasses
                 U lua_getfield
0000000000000d99 R luaJIT_BC_tensor
                 U luaL_loadfile
                 U luaL_newstate
                 U luaL_openlibs
                 U lua_pcall
                 U lua_settop
                 U luaT_newmetatable
                 U lua_tolstring
                 U luaT_pushudata
                 U luaT_toudata
                 U lua_type
0000000000000b35 T multiply
                 U printf@@GLIBC_2.2.5
0000000000000a80 t register_tm_clones
                 U THFloatStorage_newWithData
                 U THFloatTensor_newWithStorage2d
00000000002020a0 d __TMC_END__

Thanks in advance

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

On Linux Lua modules don't link to the Lua library directly but instead expect to find the Lua API functions already loaded. This is usually done by exporting them from the interpreter using the -Wl,-E linker flag. This flag only works for symbols in executables, not shared libraries. For shared libraries there exists something similar: the RTLD_GLOBAL flag for the dlopen function. By default all shared libraries listed on the compiler command line are loaded using RTLD_LOCAL instead, but fortunately Linux reuses already opened library handles. So you can either:

Preload the Lua(JIT) library using RTLD_GLOBAL before it gets loaded automatically (which happens when you load libcluaf.so):

from ctypes import byref, cdll, c_int
import ctypes

lualib = ctypes.CDLL("libluajit-5.1.so", mode=ctypes.RTLD_GLOBAL)
l = cdll.LoadLibrary('absolute_path_to_so/libcluaf.so')
# ...

Or change the flags of the Lua(JIT) library handle afterwards using the RTLD_NOLOAD flag for dlopen. This flag is not in POSIX though, and you probably have to use C to do so. See e.g. here.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...