cURL in PHP: what is it and how to use it? Sending GET and POST requests with cURL

cURL is a tool designed to transfer files and data using URL syntax. It supports many protocols, such as HTTP, FTP, TELNET, and many others. cURL was originally designed as a command-line tool; luckily for us, the cURL library is also supported by the PHP programming language. In this article, we will look at some of the advanced cURL features, and also touch on the practical application of that knowledge in PHP.

Why cURL?

In fact, there are quite a few alternative ways to fetch the content of a web page. In many cases, mostly out of laziness, I have used simple PHP functions instead of cURL:

$content = file_get_contents("http://www.nettuts.com");
// or
$lines = file("http://www.nettuts.com");
// or
readfile("http://www.nettuts.com");

However, these functions have virtually no flexibility and contain a huge number of shortcomings in terms of error handling and so on. In addition, there are certain tasks that you simply cannot solve with these standard functions: interacting with cookies, authentication, submitting a form, uploading files, and so on.

cURL is a powerful library that supports many different protocols, options, and provides detailed information about URL requests.

Basic structure

  • Initialization
  • Assigning parameters
  • Execution and fetching the result
  • Freeing up memory

// 1. initialization
$ch = curl_init();

// 2. specify the parameters, including the url
curl_setopt($ch, CURLOPT_URL, "http://www.nettuts.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);

// 3. get the HTML as the result
$output = curl_exec($ch);

// 4. close the connection
curl_close($ch);

Step #2 (that is, the calls to curl_setopt()) will get much more attention in this article than the other steps, because this is where all the most interesting and useful things happen. There are a huge number of options in cURL that let you configure a URL request in great detail. We will not cover the entire list, but will focus only on what I consider necessary and useful for this lesson. Everything else you can explore yourself if the topic interests you.

Error Check

In addition, you can also use conditional statements to test whether an operation succeeded:

// ...
$output = curl_exec($ch);

if ($output === FALSE) {
    echo "cURL Error: " . curl_error($ch);
}
// ...

Note one very important point here: we must use "=== FALSE" for the comparison instead of "== FALSE". For those not in the know, this helps us distinguish an empty result from the boolean value FALSE, which indicates an error.
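A quick sketch of why the strict comparison matters (plain PHP, no network involved): an empty response body loosely equals FALSE, but only a real failure is identical to it.

```php
<?php
// curl_exec() with CURLOPT_RETURNTRANSFER returns the body on success
// (possibly an empty string) and FALSE on failure.
$empty_body = "";   // a successful request to a blank page
$failure    = false; // what curl_exec() returns on error

// Loose comparison cannot tell the two apart:
var_dump($empty_body == false);  // bool(true)  -- looks like an error!
var_dump($failure == false);     // bool(true)

// Strict comparison can:
var_dump($empty_body === false); // bool(false) -- correctly "not an error"
var_dump($failure === false);    // bool(true)
```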

Receiving the information

Another additional step is to get data about the cURL request after it has been executed.

// ...
curl_exec($ch);

$info = curl_getinfo($ch);

echo "Took " . $info["total_time"] . " seconds for url " . $info["url"];
// ...

The returned array contains the following information:

  • "url"
  • "content_type"
  • http_code
  • “header_size”
  • "request_size"
  • “filetime”
  • “ssl_verify_result”
  • “redirect_count”
  • “total_time”
  • “namelookup_time”
  • “connect_time”
  • "pretransfer_time"
  • "size_upload"
  • size_download
  • “speed_download”
  • “speed_upload”
  • "download_content_length"
  • “upload_content_length”
  • "starttransfer_time"
  • "redirect_time"

Detecting redirects based on the browser

In this first example, we will write code that can detect URL redirects based on various browser settings. For example, some websites redirect the browsers of cell phones or other devices.

We're going to use the CURLOPT_HTTPHEADER option to set our outgoing HTTP headers, including the user agent string and the accepted languages. In the end, we will be able to determine which sites redirect us to different URLs.

// test URLs
$urls = array(
    "http://www.cnn.com",
    "http://www.mozilla.com",
    "http://www.facebook.com"
);

// browsers to test with
$browsers = array(
    "standard" => array(
        "user_agent" => "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6 (.NET CLR 3.5.30729)",
        "language"   => "en-us,en;q=0.5"
    ),
    "iphone" => array(
        "user_agent" => "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A537a Safari/419.3",
        "language"   => "en"
    ),
    "french" => array(
        "user_agent" => "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB6; .NET CLR 2.0.50727)",
        "language"   => "fr,fr-FR;q=0.5"
    )
);

foreach ($urls as $url) {
    echo "URL: $url\n";

    foreach ($browsers as $test_name => $browser) {
        $ch = curl_init();

        // specify the url
        curl_setopt($ch, CURLOPT_URL, $url);

        // set the browser headers
        curl_setopt($ch, CURLOPT_HTTPHEADER, array(
            "User-Agent: {$browser['user_agent']}",
            "Accept-Language: {$browser['language']}"
        ));

        // we don't need the page content
        curl_setopt($ch, CURLOPT_NOBODY, 1);

        // we do need the HTTP headers
        curl_setopt($ch, CURLOPT_HEADER, 1);

        // return the result instead of outputting it
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

        $output = curl_exec($ch);
        curl_close($ch);

        // Was there an HTTP redirect?
        if (preg_match("!Location: (.*)!", $output, $matches)) {
            echo "$test_name: redirects to $matches[1]\n";
        } else {
            echo "$test_name: no redirection\n";
        }
    }
    echo "\n\n";
}

First, we specify the list of site URLs that we want to check. Next, we define the browser settings with which to test each of these URLs. After that, we use a loop to run through all the URL/browser combinations.

The cURL options we set in this example fetch only the HTTP headers, not the page content (stored in $output). Next, using a simple regex, we can determine whether the string "Location:" was present in the received headers.

When you run this code, you should see, for each URL, whether each browser profile was redirected and where.
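The header-matching step can be checked in isolation on a canned response; a sketch with a hypothetical header block like the one cURL returns when CURLOPT_HEADER is on and CURLOPT_NOBODY is set:

```php
<?php
// A hypothetical raw header block, as delivered by cURL.
$output = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: http://www.example.com/mobile/\r\n"
        . "Content-Type: text/html\r\n\r\n";

if (preg_match("!Location: (.*)!", $output, $matches)) {
    // "." does not match "\n" but does match "\r", hence the trim()
    echo "redirects to " . trim($matches[1]) . "\n";
} else {
    echo "no redirection\n";
}
// prints: redirects to http://www.example.com/mobile/
```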

Making a POST request to a specific URL

When forming a GET request, the transmitted data can be passed to the URL via a “query string”. For example, when you do a Google search, the search term is placed in the address bar of the new URL:

http://www.google.com/search?q=ruseller

You don't need cURL to simulate this kind of request. If laziness finally overcomes you, use the file_get_contents() function to get the result.
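As a side note, PHP can assemble such a query string for you; a small sketch (the extra "hl" parameter is just an example):

```php
<?php
$params = array("q" => "ruseller", "hl" => "en");

// http_build_query() URL-encodes each name/value pair
// and joins the pairs with "&".
$query = http_build_query($params);

echo "http://www.google.com/search?" . $query;
// http://www.google.com/search?q=ruseller&hl=en
```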

But the thing is, some HTML forms send POST requests. The data of these forms is transported through the body of the HTTP request, and not as in the previous case. For example, if you filled out a form on a forum and clicked on the search button, then most likely a POST request will be made:

http://codeigniter.com/forums/do_search/

We can write a PHP script that can simulate this kind of URL request. First, let's create a simple file to accept and display POST data. Let's call it post_output.php:

<?php print_r($_POST); ?>

We then create a PHP script to execute the cURL request:

$url = "http://localhost/post_output.php"; $post_data = array("foo" => "bar", "query" => "Nettuts", "action" => "Submit"); $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // indicate that we have a POST request curl_setopt($ch, CURLOPT_POST, 1); // add variables curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data); $output = curl_exec($ch); curl_close($ch); echo $output;

When you run this script, you should get a similar result:

Thus, the POST request was sent to the post_output.php script, which in turn outputted the $_POST superglobal array, the contents of which we obtained using cURL.

File upload

First, let's create a file to accept and display the uploaded file data. Let's call it upload_output.php:

<?php print_r($_FILES); ?>

And here is the script code that performs the above functionality:

$url = "http://localhost/upload_output.php"; $post_data = array ("foo" => "bar", // file to upload "upload" => "@C:/wamp/www/test.zip"); $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data); $output = curl_exec($ch); curl_close($ch); echo $output;

When you want to upload a file, all you have to do is pass it as a regular POST variable, with the value prefixed by the @ symbol. When you run the script, you will get the following result:
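A caveat worth adding: the @ prefix only works on older PHP versions (it was deprecated in PHP 5.5 and removed in 5.6). On modern PHP the same upload is built with the CURLFile class; a sketch under that assumption:

```php
<?php
// On PHP 5.5+ describe the upload with CURLFile instead of "@path".
// Arguments: path on disk, MIME type, filename to report to the server.
$post_data = array(
    "foo"    => "bar",
    "upload" => new CURLFile("C:/wamp/www/test.zip", "application/zip", "test.zip")
);
// ...then pass $post_data to CURLOPT_POSTFIELDS exactly as before.
```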

Multiple cURL

One of the biggest strengths of cURL is the ability to create "multi" cURL handles. This allows you to open connections to multiple URLs at the same time, asynchronously.

In the classic version of the cURL request, the execution of the script is suspended, and the URL request operation is expected to complete, after which the script can continue. If you intend to interact with a whole lot of URLs, this will be quite time consuming, since in the classic case you can only work with one URL at a time. However, we can fix this situation by using special handlers.

Let's take a look at the code example I took from php.net:

// create the cURL resources
$ch1 = curl_init();
$ch2 = curl_init();

// specify the URLs and other parameters
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);

// create the multi cURL handle
$mh = curl_multi_init();

// add the individual handles
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);

$active = null;

// execute
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// close everything
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);

The idea is that you can use multiple cURL handlers. Using a simple loop, you can keep track of which requests have not yet been completed.

In this example, there are two main loops. The first do-while loop calls curl_multi_exec(). This function is non-blocking: it does as much work as it can right away and returns the state of the request. As long as the returned value is the constant CURLM_CALL_MULTI_PERFORM, there is still immediate work to do (for example, sending the HTTP headers for the URLs), so we keep calling it until we get a different result.

The next loop runs while $active is true. $active is the second parameter to curl_multi_exec(), passed by reference; it stays true as long as any of the existing connections is active. Inside the loop, we call curl_multi_select(). Its execution blocks while there is at least one active connection, until a response arrives. When that happens, we return to the inner loop to continue executing requests.

And now let's apply what we learned with an example that will be really useful for a large number of people.

Checking Links in WordPress

Imagine a blog with a huge number of posts and messages, each of which has links to external Internet resources. Some of these links might already be "dead" for various reasons. Perhaps the page has been deleted or the site is not working at all.

We're going to create a script that will parse all links and find websites that aren't loading and 404 pages, and then provide us with a very detailed report.

I will say right away that this is not an example of how to create a WordPress plugin. It is simply a good testing ground for our purposes.

Let's finally get started. First we have to fetch all links from the database:

// configuration
$db_host = "localhost";
$db_user = "root";
$db_pass = "";
$db_name = "wordpress";
$excluded_domains = array("localhost", "www.mydomain.com");
$max_connections = 10;

// variable initialization
$url_list = array();
$working_urls = array();
$dead_urls = array();
$not_found_urls = array();
$active = null;

// connect to MySQL
if (!mysql_connect($db_host, $db_user, $db_pass)) {
    die("Could not connect: " . mysql_error());
}

if (!mysql_select_db($db_name)) {
    die("Could not select db: " . mysql_error());
}

// select all published posts that contain links
$q = "SELECT post_content FROM wp_posts
      WHERE post_content LIKE '%href=%'
      AND post_status = 'publish'
      AND post_type = 'post'";

$r = mysql_query($q) or die(mysql_error());

while ($d = mysql_fetch_assoc($r)) {
    // extract the links with a regular expression
    if (preg_match_all('!href="(.*?)"!', $d["post_content"], $matches)) {
        foreach ($matches[1] as $url) {
            $tmp = parse_url($url);

            if (isset($tmp["host"]) && in_array($tmp["host"], $excluded_domains)) {
                continue;
            }

            $url_list[] = $url;
        }
    }
}

// remove duplicates
$url_list = array_values(array_unique($url_list));

if (!$url_list) {
    die("No URL to check");
}

First, we set up the configuration data for interacting with the database, then we list the domains that will not participate in the check ($excluded_domains). We also define the maximum number of simultaneous connections the script will use ($max_connections). We then connect to the database, select the posts that contain links, and accumulate the links into an array ($url_list).

The following code is a bit complex, so work through it carefully from start to finish:

// 1. create the multi handle
$mh = curl_multi_init();

// 2. add the initial batch of URLs
for ($i = 0; $i < $max_connections; $i++) {
    add_url_to_multi_handle($mh, $url_list);
}

// 3. kick off execution
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

// 4. main loop
while ($active && $mrc == CURLM_OK) {

    // 5. if there is activity
    if (curl_multi_select($mh) != -1) {

        // 6. do the work
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);

        // 7. is there info about a finished request?
        if ($mhinfo = curl_multi_info_read($mh)) {
            // this means a request has completed

            // 8. extract the info
            $chinfo = curl_getinfo($mhinfo["handle"]);

            // 9. dead link?
            if (!$chinfo["http_code"]) {
                $dead_urls[] = $chinfo["url"];

            // 10. 404?
            } else if ($chinfo["http_code"] == 404) {
                $not_found_urls[] = $chinfo["url"];

            // 11. working
            } else {
                $working_urls[] = $chinfo["url"];
            }

            // 12. clean up after ourselves
            curl_multi_remove_handle($mh, $mhinfo["handle"]);

            // if the script gets stuck in a loop, comment out this call
            curl_close($mhinfo["handle"]);

            // 13. add a new url and keep working
            if (add_url_to_multi_handle($mh, $url_list)) {
                do {
                    $mrc = curl_multi_exec($mh, $active);
                } while ($mrc == CURLM_CALL_MULTI_PERFORM);
            }
        }
    }
}

// 14. finish up
curl_multi_close($mh);

echo "==Dead URLs==\n";
echo implode("\n", $dead_urls) . "\n\n";
echo "==404 URLs==\n";
echo implode("\n", $not_found_urls) . "\n\n";
echo "==Working URLs==\n";
echo implode("\n", $working_urls);

// 15. the function that adds a url to the multi handle
function add_url_to_multi_handle($mh, $url_list)
{
    static $index = 0;

    // if there are still urls left to fetch
    if (isset($url_list[$index])) {
        // new curl handle
        $ch = curl_init();

        // specify the url
        curl_setopt($ch, CURLOPT_URL, $url_list[$index]);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_NOBODY, 1);

        curl_multi_add_handle($mh, $ch);

        // move on to the next url
        $index++;
        return true;
    } else {
        // all urls have been added
        return false;
    }
}

Here I will try to break everything down. The numbers in the list correspond to the numbered comments in the code.

  1. Create the multi handle;
  2. We will write the add_url_to_multi_handle() function a little later. Each time it is called, a new url is added for processing. Initially, we add 10 ($max_connections) URLs;
  3. To get started, we must run curl_multi_exec(). As long as it returns CURLM_CALL_MULTI_PERFORM, there is still work to do. We need this mainly in order to create the connections;
  4. Next comes the main loop, which executes as long as we have at least one active connection;
  5. curl_multi_select() waits until a URL request completes;
  6. Once again, we get cURL to do some work, namely to fetch the returned response data;
  7. Here we check for information about a completed request. An array is returned for each finished request;
  8. The returned array contains a cURL handle. We use it to fetch information about that particular cURL request;
  9. If the link was dead, or the request ran out of time, there will be no http code;
  10. If the link returned a 404 page, the http code will contain the value 404;
  11. Otherwise, we have a working link in front of us. (You could add additional checks for error code 500, etc.);
  12. Next, we remove the cURL handle because we don't need it anymore;
  13. Now we can add another url and run everything we talked about before;
  14. At this step, the script finishes its work. We can remove everything we don't need and generate the report;
  15. In the end, we write the function that adds a url to the multi handle. The static variable $index is incremented every time this function is called.
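The classification in steps 9-11 is just a branch on http_code; pulled out into a standalone helper (the function name is mine, for illustration), it is easy to test without any network access:

```php
<?php
// Classify a URL check result by the http_code that curl_getinfo() reported.
// 0 means the request never completed (dead host, timeout, ...).
function classify_link($http_code) {
    if (!$http_code) {
        return "dead";
    } else if ($http_code == 404) {
        return "not_found";
    }
    return "working";
}

echo classify_link(0) . "\n";    // dead
echo classify_link(404) . "\n";  // not_found
echo classify_link(200) . "\n";  // working
```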

I used this script on my blog (with some broken links added on purpose to test it out) and got the following result:

In my case, the script took just under 2 seconds to run through 40 URLs. The performance gain is significant when dealing with even more URLs. If you open ten connections at the same time, the script can run ten times faster.

A few words about other useful cURL options

HTTP Authentication

If the URL has HTTP authentication, then you can easily use the following script:

$url = "http://www.somesite.com/members/"; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // specify username and password curl_setopt($ch, CURLOPT_USERPWD, "myusername:mypassword"); // if the redirect is allowed curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // then save our data in cURL curl_setopt($ch, CURLOPT_UNRESTRICTED_AUTH, 1); $output = curl_exec($ch); curl_close($ch);

FTP upload

PHP also has a library for working with FTP, but nothing prevents you from using cURL tools here:

// open the file
$fp = fopen("/path/to/file", "r");

// the url should look like this
$url = "ftp://username:password@mydomain.com:21/path/to/new/file";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize("/path/to/file"));

// specify ASCII mode
curl_setopt($ch, CURLOPT_FTPASCII, 1);

$output = curl_exec($ch);
curl_close($ch);

Using a Proxy

You can make your URL request through a proxy:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// specify the proxy address
curl_setopt($ch, CURLOPT_PROXY, "11.11.11.11:8080");

// and, if needed, a username and password
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "user:pass");

$output = curl_exec($ch);
curl_close($ch);

Callbacks

It is also possible to specify a function that will be triggered even before the cURL request completes. For example, while the content of a response is loading, you can start using the data without waiting for it to fully load.

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://net.tutsplus.com");
curl_setopt($ch, CURLOPT_WRITEFUNCTION, "progress_function");
curl_exec($ch);
curl_close($ch);

function progress_function($ch, $str)
{
    echo $str;
    return strlen($str);
}

Such a function MUST return the length of the string it handled; this is a requirement of the cURL API.
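The callback can be exercised directly, without a network request. Here is a sketch that accumulates the chunks into a variable instead of echoing them (the buffering approach is mine, not from the original article):

```php
<?php
$buffer = "";

// A write callback that collects chunks instead of printing them.
// cURL requires it to return the number of bytes it consumed.
function collect_chunk($ch, $str)
{
    global $buffer;
    $buffer .= $str;
    return strlen($str);
}

// Simulate cURL delivering the response in two chunks:
collect_chunk(null, "Hello, ");
collect_chunk(null, "world!");

echo $buffer; // Hello, world!
```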

Conclusion

Today we got acquainted with how you can use the cURL library for your own selfish purposes. I hope you enjoyed this article.

Thank you! Have a good day!

Why do we need PHP CURL?
To send an HTTP GET request, we can simply use the file_get_contents() method:

File_get_contents("http://site")

But sending POST requests and handling errors are not easy with file_get_contents().

Sending HTTP requests is very simple with PHP CURL.

step 1). Initialize CURL session

$ch = curl_init();

step 2). Provide options for the CURL session

curl_setopt($ch, CURLOPT_URL, "http://site");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//curl_setopt($ch, CURLOPT_HEADER, true); //if you want headers

CURLOPT_URL -> the URL to fetch
CURLOPT_HEADER -> whether to include the header in the output
CURLOPT_RETURNTRANSFER -> if set to true, the data is returned as a string instead of being output directly.

step 3). Execute the CURL session

$output=curl_exec($ch);

step 4). Close the session

curl_close($ch);

note: You can check whether CURL is enabled or not with the following code.

if (is_callable("curl_init")) {
    echo "Enabled";
} else {
    echo "Not enabled";
}

1.PHP CURL GET Example

You can use the code below to send a GET request.

function httpGet($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // curl_setopt($ch, CURLOPT_HEADER, false);

    $output = curl_exec($ch);
    curl_close($ch);

    return $output;
}

echo httpGet("http://site");

2.PHP CURL POST Example


You can use the code below to submit a form using PHP CURL.

function httpPost($url, $params)
{
    $postData = "";

    // create name=value pairs separated by &
    foreach ($params as $k => $v) {
        $postData .= $k . "=" . urlencode($v) . "&";
    }
    $postData = rtrim($postData, "&");

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);

    $output = curl_exec($ch);
    curl_close($ch);

    return $output;
}

How to use the function:

$params = array("name" => "Ravishanker Kusuma", "age" => "32", "location" => "India"); echo httpPost("http://site/examples/php/curl-examples/post.php",$params);

3.Send Random User-Agent in the Requests

You can use the below function to get Random User-Agent.

function getRandomUserAgent()
{
    $userAgents = array(
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)",
        "Opera/9.20 (Windows NT 6.0; U; en)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50",
        "Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.02",
        "Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; fr; rv:1.7) Gecko/20040624 Firefox/0.9",
        "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/48 (like Gecko) Safari/48"
    );

    $random = rand(0, count($userAgents) - 1);
    return $userAgents[$random];
}

Using CURLOPT_USERAGENT, you can set User-Agent string.

curl_setopt($ch, CURLOPT_USERAGENT, getRandomUserAgent());

4.Handle redirects (HTTP 301,302)

To handle URL redirects, set CURLOPT_FOLLOWLOCATION to TRUE. The maximum number of redirects can be controlled using CURLOPT_MAXREDIRS.

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_MAXREDIRS, 2); // follow at most 2 redirects

5.How to handle CURL errors

We can use the curl_errno() and curl_error() methods to get the last error for the current session.
curl_error($ch) -> returns the error as a string
curl_errno($ch) -> returns the error number
You can use the below code to handle errors.

function httpGetWithErrors($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $output = curl_exec($ch);

    if ($output === false) {
        echo "Error Number: " . curl_errno($ch) . "\n";
        echo "Error String: " . curl_error($ch);
    }

    curl_close($ch);
    return $output;
}

In the last article, we reviewed sending POST requests. However, sometimes a script only accepts GET requests (as a rule, these are search scripts). To work with such scripts and receive data from them, you need to be able to send GET requests in cURL, which is exactly what you will learn by reading this article.

The principle of sending GET requests in cURL is exactly the same as with the POST method: there is a source file and a destination file. The source file, using the cURL module, sends a GET request to the destination file. The destination file processes this request and returns the result, which the source file receives, again using the capabilities of cURL.

To make everything absolutely clear, let's look at the simple example we already considered when sending POST requests: the source file sends two numbers, the destination file returns their sum, and the source file receives it into a variable, which is then output to the browser.

To start, here is the destination file ("receiver.php"):

$a = $_GET["a"];
$b = $_GET["b"];
echo $a + $b;
?>

Everything is very simple here: we write the data from the GET request into the variables $a and $b, and then output their sum, which will be received by the source file.

Now let's create the source file itself.
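The source file's code did not survive in this copy of the article; here is a minimal sketch of what it would look like, under the assumption that receiver.php is served from localhost (5 and 7 are sample numbers):

```php
<?php
// Build the GET query string for receiver.php.
$url = "http://localhost/receiver.php?" . http_build_query(array("a" => 5, "b" => 7));

// The request itself (not executed here, since it needs receiver.php deployed):
function fetch_sum($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $result = curl_exec($ch); // receiver.php echoes the sum
    curl_close($ch);

    return $result;
}

echo $url . "\n"; // http://localhost/receiver.php?a=5&b=7
// echo fetch_sum($url); // would print 12 with receiver.php in place
```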

The CURL library (Client URLs) allows you to transfer files to a remote computer using many Internet protocols. It has a very flexible configuration and allows you to perform almost any remote request.

CURL supports the HTTP, HTTPS, FTP, FTPS, DICT, TELNET, LDAP, FILE, and GOPHER protocols, as well as HTTP POST, HTTP PUT, cookies, FTP uploads, resuming interrupted file transfers, passwords, port numbers, SSL certificates, Kerberos, and proxies.

Using CURL, a web server can act as a full-fledged client for any HTTP-based service, such as XML-RPC, SOAP, or WebDAV.

In general, using the library comes down to four steps:

  1. Creating a CURL resource using the curl_init function.
  2. Setting parameters using the curl_setopt function.
  3. Executing a request using the curl_exec function.
  4. Freeing the CURL resource using the curl_close function.

A simple example of using CURL

<?php
// Initialize the curl library
if ($ch = @curl_init())
{
    // Set the request URL
    @curl_setopt($ch, CURLOPT_URL, "http://server.com/");
    // If true, CURL includes the headers in the output
    @curl_setopt($ch, CURLOPT_HEADER, false);
    // Where to put the result of the request:
    // false - to the standard output stream,
    // true - as the return value of the curl_exec function.
    @curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Maximum wait time in seconds
    @curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    // Set the value of the User-Agent field
    @curl_setopt($ch, CURLOPT_USERAGENT, "PHP Bot (http://blog.yousoft.ru)");
    // Execute the request
    $data = @curl_exec($ch);
    // Output the received data
    echo $data;
    // Free the resource
    @curl_close($ch);
}
?>

An example of using a GET request

<?php
$ch = curl_init();
// The GET parameters are specified in the URL string
curl_setopt($ch, CURLOPT_URL, "http://server.com/?s=CURL");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);

$data = curl_exec($ch);
curl_close($ch);
?>

Sending a GET request is no different from getting a page. It is important to note that the query string is formed as follows:

http://server.com/index.php?name1=value1&name2=value2&name3=value3

where http://server.com/index.php is the page address, nameX is the name of the variable, valueX is the value of the variable.

An example of using a POST request

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://server.com/index.php");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// You need to explicitly indicate that this will be a POST request
curl_setopt($ch, CURLOPT_POST, true);
// The variable values are passed here
curl_setopt($ch, CURLOPT_POSTFIELDS, "s=CURL");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_USERAGENT, "PHP Bot (http://mysite.ru)");
$data = curl_exec($ch);
curl_close($ch);
?>

Sending a POST request is not much different from sending a GET request. All basic steps remain the same. Variables are also specified in pairs: name1=value1&name2=value2 .

HTTP Authorization Example

<?php
// HTTP authorization
$url = "http://server.com/protected/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERPWD, "myusername:mypassword");
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>

FTP session example

<?php
$fp = fopen(__FILE__, "r");
$url = "ftp://username:password@mydomain.com:21/path/to/newfile.php";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);
curl_setopt($ch, CURLOPT_FTPASCII, 1);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize(__FILE__));
$result = curl_exec($ch);
curl_close($ch);
?>

If you are having problems using cURL, add the following lines before curl_close to get a report of the last request made:

print_r(curl_getinfo($ch));
echo "cURL error number: " . curl_errno($ch) . "\n";
echo "cURL error: " . curl_error($ch) . "\n";
curl_close($ch);

This article assumes that you are familiar with the basics of networking and the HTML language.

The ability to write scripts is essential to building a good computer system. The extensibility of Unix systems through shell scripts and various programs that execute automated commands is one of the reasons why they are so successful.

The increasing number of applications that are moving to the web has led to the fact that the topic of HTTP scripts is becoming more and more in demand. Important tasks in this area are the automatic extraction of information from the Internet, sending or downloading data to web servers, etc.

Curl is a command-line tool that lets you manipulate URLs and transfer data of various kinds. This article focuses on making simple HTTP requests. It is assumed that you already know how to run

# curl --help

# curl --manual

for information about curl.

Curl is not a tool that will do everything for you. It creates requests, receives data, and sends data. You may need some "glue" to hold everything together, perhaps some scripting language (like bash) or a few manual calls.

1. HTTP protocol

HTTP is the protocol used when receiving data from web servers. It is a very simple protocol that is built on top of TCP/IP. The protocol also allows information to be sent to the server from the client using several methods, as will be shown next.

HTTP requests are strings of ASCII text sent from a client to a server to request some action. When a request is received, the server responds to the client with several lines of service text, and then with the actual content.

Using the curl -v option, you can see which commands curl sends to the server, as well as other informational text. The -v switch is perhaps the only way to debug or even understand the interaction between curl and the web server.

2. URL

The URL format (Uniform Resource Locator - universal resource address) specifies the address of a specific resource on the Internet. You probably know this, examples of URLs are http://curl.haxx.se or https://yourbank.com.

3. Get (GET) page

The simplest and most common HTTP request is to get the content of a URL. The URL can refer to a web page, an image, or a file. The client sends a GET request to the server and receives the requested document. If you run the command

# curl http://curl.haxx.se

you will get a web page displayed in your terminal window: the complete HTML document at that URL.

All HTTP responses contain a set of headers that are usually hidden. To see them along with the document itself, use the curl -i option. You can also request only headers with the -I switch (which will force curl to make a HEAD request).

4. Forms

Forms are the primary way a website presents an HTML page with fields in which the user enters data, then clicks an "OK" or "Submit" button, after which the data is sent to the server. The server then uses the received data and decides how to proceed: look up the information in a database, show the entered address on a map, show an error message, or use the information to authenticate the user. Of course, there is some program on the server side that accepts your data.

4.1 GET

A GET form uses the GET method. In the browser, such a form appears as a text box and a button labeled "OK". If you enter "1905" and click OK, the browser generates a new URL to go to: the path of the previous URL followed by a string like "junk.cgi?birthyear=1905&press=OK".

For example, if the form was located at "www.hotmail.com/when/birth.html", then clicking the OK button will take you to the URL "www.hotmail.com/when/junk.cgi?birthyear=1905&press=OK".

Most search engines work this way.
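The URL the browser builds from a GET form can be sketched with Python's standard library (field names and the URL are taken from the example above):

```python
from urllib.parse import urlencode

# Encode the form fields as a query string and append it to the
# form's action URL, exactly as the browser does for a GET form.
fields = {"birthyear": "1905", "press": "OK"}
url = "http://www.hotmail.com/when/junk.cgi?" + urlencode(fields)
print(url)  # http://www.hotmail.com/when/junk.cgi?birthyear=1905&press=OK
```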

To have curl issue the same GET request, simply pass the URL the form would generate:

# curl "www.hotmail.com/when/junk.cgi?birthyear=1905&press=OK"

4.2 POST

The GET method causes all the entered information to be displayed in the address bar of your browser. This may be fine when you need to bookmark a page, but it's an obvious disadvantage when you're entering secret information into form fields, or when the amount of information entered into the fields is too large (resulting in an unreadable URL).

The HTTP protocol provides the POST method. With it, the client sends data separately from the URL and therefore you will not see it in the address bar.

A form that generates a POST request looks much like the previous one, but uses the POST method.

Curl can form a POST request with the same data as follows:

# curl -d "birthyear=1905&press=%20OK%20" www.hotmail.com/when/junk.cgi

This POST request uses the "Content-Type: application/x-www-form-urlencoded" header, which is the most widely used encoding for form data.

The data you send to the server must be properly encoded, curl won't do it for you. For example, if you want the data to contain a space, you need to replace that space with %20, and so on. Lack of attention to this issue is a common mistake, due to which the data is not transmitted as it should.
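Python's standard library shows what proper encoding looks like; this sketch uses quote, which turns a space into %20 as in the curl example above:

```python
from urllib.parse import quote

# Percent-encode a form value: the space becomes %20, letters
# and digits are left as-is.
value = " OK "
print(quote(value))  # %20OK%20
print("birthyear=1905&press=" + quote(value))
```

Newer versions of curl also offer a --data-urlencode switch that performs this encoding for you.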

4.3 File upload (POST)

Back in 1995, an additional way to transfer data over HTTP was defined. It is documented in RFC 1867, which is why it is sometimes referred to as RFC1867-posting.

This method is mainly designed to better support file uploads. An HTML form that lets the user upload a file uses the POST method with its enctype attribute set to multipart/form-data.

Note that the Content-Type is set to multipart/form-data.

To send data to such a form using curl, enter the command:

# curl -F upload=@localfilename -F press=OK [URL]

4.4 Hidden fields

A common way to pass state information in HTML applications is to use hidden fields in forms. Hidden fields are not filled in by the user; they are invisible, and they are submitted just like regular fields.

A simple example is a form with one visible field, one hidden field, and an OK button.

To send a POST request with curl, you don't have to think about whether the field is hidden or not. For curl they are all the same:

# curl -d "birthyear=1905&press=OK&person=daniel" [URL]

4.5 Find out what a POST request looks like

When you want to fill out a form and send data to the server using curl, you probably want the POST request to look exactly like the one made using the browser.

An easy way to see your POST request is to save the form's HTML page to disk, change the method to GET, and hit the "Submit" button (you can also change the URL to which the data will be submitted).

You will see the data appended to the URL after a "?" character, as expected for a GET form.

5. PUT

Perhaps the best way to upload data to an HTTP server is to use PUT. Again, this requires a program (script) on the back end that knows what to do and how to accept an HTTP PUT stream.

Send file to server using curl:

# curl -T uploadfile www.uploadhttp.com/receive.cgi
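With Python's standard library, the same PUT can be sketched as follows (URL and payload are taken from the example; the request is only constructed here, not executed):

```python
from urllib.request import Request

# Build a PUT request carrying the file contents as the body.
req = Request("http://www.uploadhttp.com/receive.cgi",
              data=b"contents of uploadfile",  # illustrative payload
              method="PUT")
print(req.get_method())  # PUT
```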

6. Authentication

Authentication means passing a username and password to the server, which checks whether you have the right to perform the request. Basic authentication (which curl uses by default) is plain-text based: the username and password are not encrypted, only lightly "obfuscated" with the Base64 algorithm, so anyone who intercepts the traffic between you and the HTTP server can recover them.
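A quick sketch shows that this "obfuscation" really is just Base64, and that decoding it back is trivial (credentials are illustrative):

```python
import base64

# Basic authentication: "user:password" encoded with Base64 and
# sent in the Authorization header.
credentials = b"name:password"
token = base64.b64encode(credentials).decode()
header = "Authorization: Basic " + token
print(header)

# Decoding is trivial — Base64 is not encryption.
print(base64.b64decode(token))  # b'name:password'
```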

Tell curl to use username and password:

# curl -u name:password www.secrets.com

The site may require a different authentication method (see what the server sends in its headers); in those cases, use the --ntlm, --digest, --negotiate or even --anyauth switches. Sometimes access to external HTTP servers goes through a proxy, as is common in companies. An HTTP proxy may require its own username and password for Internet access. The corresponding curl switch:

# curl -U proxyuser:proxypassword curl.haxx.se

If the proxy requires NTLM authentication, specify --proxy-ntlm, if the Digest method, then --proxy-digest.

If you do not specify a password in the -u and -U options, then curl will ask you for it interactively.

Note that while curl is running, the command line (and with it the switches and passwords) may be visible to other users on your system in the process list. There are ways to prevent this. More on that below.

7. Referer

An HTTP request may include a "referer" field that indicates the URL from which the user came to this resource. Some programs/scripts check the "referer" field and do not execute the request if the user came from an unknown page. Although this is a silly way to check, many scripts use it nonetheless. With curl, you can put anything in the "referer" field and thus force it to do what you want.

This is done in the following way:

# curl -e http://curl.haxx.se daniel.haxx.se

8. User-Agent

All HTTP requests may include a "User-Agent" field that identifies the user's client application. Many web applications use this information to render the page one way or another. Web programmers create several versions of a page for users of different browsers in order to improve appearance, use different JavaScript or VBScript, and so on.

Sometimes you may find that curl returns a page different from what you saw in your browser. In that case, it is appropriate to set the "User-Agent" field to fool the server once again.

Disguise curl as Internet Explorer on a Windows 2000 machine:

# curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" [URL]

Why not become Netscape 4.73 on a Linux machine (PIII):

# curl -A "Mozilla/4.73 (X11; U; Linux 2.2.15 i686)" [URL]

9. Redirects

In response to your request, the server, instead of the page itself, may return an indication of where the browser should go next to get to the desired page. The header that tells the browser this redirect is "Location:".

By default, curl does not follow the address given in "Location:", but simply displays the page as usual. You can make it follow the redirect like this:

# curl -L www.sitethatredirects.com

If you are using curl to send POST requests to a site that immediately redirects to another page, you can safely combine -L with -d/-F. Curl will make a POST request for the first page and then a GET request for the next one.

10. Cookies

Cookies allow web browsers to keep state on the client side. A cookie is a name with content attached. When sending a cookie, the server tells the client the path and host name to which the cookie should be sent next time, its lifetime, and some other parameters.

When a client connects to the server at the address specified in the received cookie, the client sends that cookie to the server (if the lifetime has not expired).

Many applications and servers use this method to combine multiple requests into one logical session. In order for curl to also perform this function, we must be able to save and send cookies, just like browsers do.
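What the server sends is a Set-Cookie header. Python's standard library can parse one, which shows the name, content, and attributes involved (the values are illustrative):

```python
from http.cookies import SimpleCookie

# Parse a Set-Cookie header value as a browser (or curl) would.
cookie = SimpleCookie()
cookie.load("name=Daniel; Path=/; Domain=.cookiesite.com")

morsel = cookie["name"]
print(morsel.value)      # Daniel
print(morsel["path"])    # /
print(morsel["domain"])  # .cookiesite.com
```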

The easiest way to send a cookie to the server when fetching a page with curl is to add the appropriate option on the command line:

# curl -b "name=Daniel" www.cookiesite.com

Cookies are sent as regular HTTP headers, so curl can save cookies simply by saving headers. Saving cookies this way is done with the command:

# curl -D headers_and_cookies www.cookiesite.com

(by the way, it is better to use the -c switch to save cookies, more on that below).

curl has a fully featured cookie handler which is useful when you want to connect to the server again and use the cookies you saved last time (or hand-crafted). To use cookies stored in a file, call curl like this:

# curl -b stored_cookies_in_file www.cookiesite.com

The curl cookie engine is enabled when you specify the -b switch. If you want curl to only accept cookies, use -b with a file that doesn't exist. For example, if you want curl to accept cookies from a page and then follow a redirect (perhaps giving away the cookie that was just accepted), you can call curl like this:

# curl -b nada -L www.cookiesite.com

Curl can read and write cookies in the Netscape/Mozilla format. This is a convenient way to exchange cookies between browsers and automated scripts. The -b switch automatically detects whether a given file is in that browser format and handles it accordingly, and with the -c/--cookie-jar switch you can make curl write a new cookie file when the operation completes:

# curl -b cookies.txt -c newcookies.txt www.cookiesite.com
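A Netscape-format cookie file stores one cookie per line as seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry, name, value — field order as commonly documented). A minimal reading sketch with illustrative values:

```python
# One line of a Netscape-format cookie jar.
line = ".cookiesite.com\tTRUE\t/\tFALSE\t0\tname\tDaniel"

# Split into the seven tab-separated fields.
domain, subdomains, path, secure, expires, name, value = line.split("\t")
print(name, "=", value, "for", domain + path)
```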

11. HTTPS

There are several ways to secure your HTTP transmissions. The most well-known protocol that solves this problem is HTTPS, or HTTP over SSL. SSL encrypts all data sent and received over the network, which increases the likelihood that your information will remain secret.

Curl supports requests to HTTPS servers thanks to the free OpenSSL library. Requests are made in the usual way:

# curl https://that.secure.server.com

11.1 Certificates

In the HTTPS world, certificates are used for authentication in addition to the username and password. Curl supports client-side certificates. A certificate is protected by a passphrase that you must supply before curl can use it. The passphrase can be specified on the command line or entered interactively. Certificates in curl are used like this:

# curl -E mycert.pem https://that.secure.server.com

Curl also verifies the server's identity by checking the server's certificate against a locally stored one. A mismatch will make curl refuse to connect. To skip this verification, use the -k switch.

More information about certificates can be found at http://curl.haxx.se/docs/sslcerts.html.

12. Arbitrary request headers

You may need to modify or add elements of individual curl requests.

For example, you can change the request method from POST to PROPFIND and send the data with "Content-Type: text/xml" (instead of the usual Content-Type):

# curl -d " " -H "Content-Type: text/xml" -X PROPFIND url.com

You can remove a header by specifying it without content. For example, you can remove the "Host:" header, making the request "empty":

# curl -H "Host:" http://mysite.com

You can also add headers. Your server may need a "Destination:" header:

# curl -H "Destination: http://moo.com/nowhere" http://url.com

13. Debugging

It often happens that a site responds to curl requests differently than to browser requests. In that case, you need to make curl imitate the browser as closely as possible:

  • Use the --trace-ascii switch to save a detailed report of requests so that you can examine them in detail and understand the problem.
  • Make sure you handle cookies and use them when needed (-b switch to read, -c switch to save)
  • Specify one of the latest popular browsers in the "user-agent" field
  • Fill in the "referer" field as the browser does
  • If you are using POST requests, make sure that all fields are passed in the same order as the browser (see above, point 4.5)

A good helper in this task is the Mozilla/Firefox LiveHTTPHeaders plugin, which lets you view all the headers the browser sends and receives (even over HTTPS).

A more low-level approach is to capture HTTP traffic on the network with tools such as Ethereal (now Wireshark) or tcpdump and then analyze which headers the browser sent and received (HTTPS makes this approach ineffective).

14. References

RFC 2616 is required reading for anyone who wants to understand the HTTP protocol.

RFC 2396 explains the URL syntax.

RFC 2109 defines how cookies work.

RFC 1867 defines the File Upload Post format.

http://openssl.planetmirror.com - home page of the OpenSSL project

http://curl.haxx.se - cURL project homepage

 