
[web] A new job, and many questions

This topic is 4302 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hi, I have my first programming job, developing PHP applications in a travel agency. I'm so happy! "Professional" just sounds cool (even if it's only web dev). [grin]

Well, but I have a few problems. We use a 3rd-party Internet Booking Engine (IBE) to sell our travels, which is currently integrated into the site via an iframe. We want to add content to the HTML of the IBE. To do so, I have to replace the iframe and embed the IBE's HTML content directly into our site. Currently, I'm doing this:
- Retrieving the IBE start page
- Inlining the JavaScript
- Replacing any relative URLs with absolute ones that call our site with additional parameters giving the IBE's current page URL and form data

So, do you have any advice on how to do this (integrating 3rd-party content, spidering, scraping) cleanly? Are there any libraries/tutorials/sites/blogs/IDEs/... you would recommend to a PHP beginner?
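For illustration, the URL-rewriting step described above might look roughly like the sketch below. It is only a sketch under assumptions: the host `ibe.example.com`, the script name `/proxy.php`, and the `ibe_url` parameter are made-up placeholders, the helper names are hypothetical, and the resolver is deliberately simplified (it does not normalize `../` segments). It assumes PHP 7.1+ with the DOM extension.

```php
<?php
// Hypothetical helper: resolve $url against $base (simplified; "../" is not normalized).
function to_absolute(string $url, string $base): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $url)) {
        return $url; // already has a scheme (http:, https:, mailto:, javascript:, ...)
    }
    $p = parse_url($base);
    $origin = $p['scheme'] . '://' . $p['host'] . (isset($p['port']) ? ':' . $p['port'] : '');
    if (strpos($url, '//') === 0) {
        return $p['scheme'] . ':' . $url; // protocol-relative URL
    }
    if (strpos($url, '/') === 0) {
        return $origin . $url; // root-relative URL
    }
    $path = $p['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1); // directory part of the base path
    return $origin . $dir . $url;
}

// Hypothetical helper: make URLs absolute; send links and forms through our own script.
function rewrite_links(string $html, string $ibeBase, string $proxy): string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // real-world HTML is rarely valid; silence parser warnings

    // tag => [attribute, route through our proxy script?]
    $targets = [
        'a'      => ['href', true],
        'form'   => ['action', true],
        'img'    => ['src', false],
        'script' => ['src', false],
    ];
    foreach ($targets as $tag => [$attr, $viaProxy]) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $value = $node->getAttribute($attr);
            if ($value === '' || $value[0] === '#') {
                continue; // leave missing attributes and in-page anchors alone
            }
            $abs = to_absolute($value, $ibeBase);
            if ($viaProxy && preg_match('#^https?://#i', $abs)) {
                $abs = $proxy . '?ibe_url=' . rawurlencode($abs);
            }
            $node->setAttribute($attr, $abs);
        }
    }
    return $doc->saveHTML();
}
```

Note that this only touches static markup. URLs that the IBE builds in JavaScript at runtime are not visible to a DOM pass like this and would need separate handling.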

I would be interested to know the answer to this. I would like to include some of my site's content on other people's websites by including text, possibly through JavaScript.

As your site uses PHP, you might be able to open a connection to the 3rd-party site and retrieve the page; then, if it's in XML, process it and print it out as desired. If it's pure HTML you can still print it out, but processing and restructuring the data will be a lot more limited.

You could use cURL or even just file() in PHP to get all of the data off the page. Depending on how their content is laid out, the only challenging part would be parsing the data.


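<?php
// file() returns the page as an array of lines, or false on failure.
// Fetching a URL this way requires allow_url_fopen to be enabled.
$contents = file('http://www.gamedev.net');
if ($contents === false) {
    die('Could not retrieve the page.');
}

// Do something with the contents

foreach ($contents as $line) {
    echo $line; // each line keeps its trailing newline by default
}
?>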


Well, it's HTML, and parsing is a REAL mess, because they use javascript to generate HTML with embedded javascript which generates HTML and javascript. [imwithstupid]
So the problem is
a) dealing with links relative to their site
b) forwarding their form (POST) data
c) adjusting their javascript.

In addition, I'm asking for helpful things related to PHP, MySQL, and JavaScript.
If you could tell me helpful libraries, sites, ebooks, editors, IDEs, ... I'd be forever grateful. [smile]

If you need to retrieve information from a page that expects POST data, then cURL is essential. Are you just trying to get the links out? If so, you should be able to easily search for <a and </a> and grab the content in between.
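As a rough sketch of both ideas, the snippet below posts form fields with PHP's cURL extension and then pulls out the link targets. The function names are hypothetical, and the link extraction uses DOMDocument rather than a plain string search for `<a`, which is a substitution on my part because it copes better with attributes and odd spacing. It assumes the curl and DOM extensions are installed.

```php
<?php
// Sketch: POST form fields to a page and return the response body.
function fetch_with_post(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // sent as a urlencoded form body
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 10,    // seconds
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Sketch: collect the href of every <a> element in an HTML string.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // silence warnings about invalid markup
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $links[] = $a->getAttribute('href');
        }
    }
    return $links;
}
```

Forwarding a visitor's form submission could then be something like `extract_links(fetch_with_post($targetUrl, $_POST))`, where `$targetUrl` stands for whatever page the form was meant to reach.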

Your best bet is to get them to produce a machine-readable protocol. Using screen-scraping in a production application is an unbelievably bad idea, as the other party could change the format of their HTML at any time without notifying you, and your system would then break.

You should buy their server-server integration product instead.

If they are a payment service provider, you may find that you need to satisfy due diligence, and get an SSL certificate in order to do this. Live with it - it's the only way of doing things.

Moreover, screen-scraping is probably against their terms and conditions, so they would be well within their rights to terminate your account.

It's in their interests to make their kit easy to integrate with. I'm sure they have a machine-readable protocol.

Mark

I have no other options, so I'm thinking of either
a) caching and inlining the IBE HTML with PHP or
b) using javascript to scrape external content on the client side.
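For option a), a minimal file-based cache might look like the sketch below. The function name, the cache directory, and the lifetime are all assumptions, and the actual download is passed in as a callable so any fetching method can be plugged in. Pages that depend on a visitor's session or form data probably should not be cached this way; it suits static pages such as the start page.

```php
<?php
// Sketch: return a cached copy of $url if it is younger than $ttl seconds,
// otherwise call $fetch($url) and store the result on disk.
function cached_fetch(string $url, string $cacheDir, int $ttl, callable $fetch): string
{
    $file = rtrim($cacheDir, '/') . '/' . md5($url) . '.html';

    clearstatcache(); // make sure is_file()/filemtime() see the current state
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return file_get_contents($file);
    }

    $html = $fetch($url);
    file_put_contents($file, $html, LOCK_EX); // lock to avoid half-written cache files
    return $html;
}
```

A call could look like `cached_fetch($url, '/tmp/ibe_cache', 300, 'file_get_contents')`, with the directory and the five-minute lifetime chosen purely for illustration.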

I have another question. I'm looking for a "free web service" directory, as I need weather data. With Google, I came across uddi.org, but I still can't find a single listing of cost-free web services. The only ones I know are the Google, eBay and Amazon APIs.
Do you know any such "web service directories"?
