Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
869 views
in Technique[技术] by (71.8m points)

.net - C# Excel file OLEDB read HTML IMPORT

I have to automate something for the finance dpt. I've got an Excel file which I want to read using OleDb:

string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=A_File.xls;Extended Properties=""HTML Import;IMEX=1;""";

using (OleDbConnection connection = new OleDbConnection())
{
    using (DbCommand command = connection.CreateCommand())
    {
        connection.ConnectionString = connectionString;
        connection.Open();

        DataTable dtSchema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);                        
        if( (null == dtSchema) || ( dtSchema.Rows.Count <= 0 ) )                        
        {                                
            //raise exception if needed                        
        }

        command.CommandText = "SELECT * FROM [NameOfTheWorksheet$]";

        using (DbDataReader dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                //do something with the data
            }
        }
    }
}

Normally the connectionstring would have an extended property "Excel 8.0", but the file can't be read that way because it seems to be an html file renamed to .xls. when I copy the data from the xls to a new xls, I can read the new xls with the E.P. set to "Excel 8.0".

Yes, I can read the file by creating an instance of Excel, but I rather not.. Any idea how I can read the xls using OleDb without making manual changes to the xls or by playing with ranges in a instanciated Excel?

Regards,

Michel

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I asked this same question on another forum and got the answer so I figured I'd share it here. As per this article: http://ewbi.blogs.com/develops/2006/12/reading_html_ta.html

Instead of using the sheetname, you must use the page title in the select statement without the $. SELECT * FROM [HTMLPageTitle]


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...