Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


PDF to HTML conversion in PHP

In my PHP script I want to convert a PDF file to HTML format, and the contents of the generated HTML file should not be disturbed.

I found http://sourceforge.net/projects/pdftohtml/, but it is a command-line tool and needs shell access. The second problem is that the content of the generated HTML file gets disturbed.


Fight dragons for too long and you become a dragon yourself; gaze into the abyss for too long and the abyss will gaze back…

1 Reply


Can the shell command be executed from php?

$rtn = exec ('CLI Command to execute', $emptyVartoCaptureOutput);

The command is executed in the shell, under the account of the user running the PHP script (_WWW or similar for scripts run from an Apache web server). When you supply the optional second argument, every line of the command's output is appended to that array, one element per line. The return value of exec() itself is only the last line of output.
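As a minimal sketch of how exec() hands back its results, the snippet below wraps the call in a small helper. The name run_command and the echo test command are only illustrative assumptions, and the example assumes a Unix-like shell.

```php
<?php
// Hypothetical helper: run a shell command and collect everything exec() reports.
function run_command(string $command): array
{
    $lines = [];   // exec() appends to this array, so start with an empty one
    $status = 0;   // filled in by exec() with the command's exit status
    $last = exec($command, $lines, $status);

    return [
        'lines'  => $lines,   // one element per line of standard output
        'last'   => $last,    // return value of exec(): the last output line
        'status' => $status,  // 0 conventionally means success
    ];
}

// Illustrative command only; substitute the real converter command here.
$result = run_command('echo first && echo second');
print_r($result['lines']);
```

Checking the exit status (the optional third argument) is worth the extra line, because exec() does not raise an error just because the command itself failed.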

Seems like this might solve your problem.

In response to your comment:

The tool you reference in your original post is the command-line tool you would execute. You need to work out the exact command to run, including any and all arguments for that command.

I am not familiar with the tool you reference, but I suspect that it has various options. An important one to look at is where the generated HTML goes. I would guess it can go either to a file (that would require _WWW to have write permission to a directory, which is a huge security risk) or to stdout. When you pass exec() its optional second parameter, any output the command sends to stdout is saved in that array, with a new element for each line. Thus you should be able to capture the generated HTML and then manipulate and/or display it dynamically from your script.
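Here is a rough sketch of capturing a converter's stdout from PHP. The helper names build_command and convert_pdf are hypothetical, and the code assumes the tool writes its HTML to stdout and exits with status 0 on success. Arguments are passed through escapeshellarg() so that a file name containing spaces or shell metacharacters cannot change the command.

```php
<?php
// Hypothetical helper: assemble a shell command from a tool name,
// a list of options, and the path of the PDF to convert.
function build_command(string $tool, array $options, string $pdfPath): string
{
    $parts = [escapeshellcmd($tool)];
    foreach ($options as $option) {
        $parts[] = escapeshellarg($option);   // quote every option
    }
    $parts[] = escapeshellarg($pdfPath);      // quote the file name too
    return implode(' ', $parts);
}

// Hypothetical helper: run the converter and return the HTML it printed
// to stdout, or null when the command exits with a non-zero status.
function convert_pdf(string $tool, array $options, string $pdfPath): ?string
{
    $lines = [];
    $status = 0;
    exec(build_command($tool, $options, $pdfPath), $lines, $status);

    if ($status !== 0) {
        return null;
    }
    // exec() drops each line's trailing newline, so put them back.
    return implode("\n", $lines);
}
```

With the poppler build of pdftohtml, I believe a call such as convert_pdf('pdftohtml', ['-stdout', '-noframes'], $path) would be the shape to try, but treat those two flags as an assumption and confirm them against the help output of the version you have installed.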

For a simple html page that only displays the html from a pdf, you might do something like this:

<std header stuff omitted for brevity>
<?php
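// exec() returns the last output line; every line is appended to $outputted_html.
$rtn = exec('CLI Command to Execute -a option1 -b option2', $outputted_html);
foreach ($outputted_html as $val) {
    // Each element is one line of the generated HTML without its trailing newline.
    echo $val . "\n";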
}
?>
</body>
</html>

You could use echo implode("\n", $outputted_html); in place of the foreach loop to accomplish the same thing, but the foreach loop gives you some control over each line if you choose to take advantage of it.

Note that the generated HTML may or may not contain header info (doctype, <html>, <head>, <body>); you will have to experiment and see. You can then add whatever a standard HTML page needs, or leave it out if the tool already provides it.

So you now have the basis for displaying the PDF files as HTML. If you need specific help with the intricacies of the tool, I suggest you seek out a forum or listserv dedicated to that tool, or perhaps request help from the developer(s) after reading the docs and FAQs.
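One way to handle both cases is to test the captured output before printing it. This is only a sketch: ensure_full_page is a made-up helper name, the default title is a placeholder, and looking for an opening <html tag is a crude heuristic rather than real HTML parsing.

```php
<?php
// Hypothetical helper: wrap converter output in a minimal page only when
// it does not already look like a complete HTML document.
function ensure_full_page(string $html, string $title = 'Converted PDF'): string
{
    // Crude check: a full document contains an opening <html> tag somewhere.
    if (stripos($html, '<html') !== false) {
        return $html;
    }

    return "<!DOCTYPE html>\n<html>\n<head>\n<title>"
        . htmlspecialchars($title)
        . "</title>\n</head>\n<body>\n"
        . $html
        . "\n</body>\n</html>\n";
}

// A bare fragment gets wrapped; a full document would pass through unchanged.
echo ensure_full_page('<p>Hello from a fragment</p>');
```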

