|
Server Help Community forums for Subgame, ASSS, and bots
|
Author |
Message |
bontinum Guest
Offline
|
Posted: Mon Aug 13, 2012 12:45 am Post subject: Basic winsock trouble getting correct source |
Hi, I'm trying to make a simple program that gets the first 512 characters of the page source of a Macy's web page.
Simple code:
string url = "GET http://www1.macys.com/shop/product/?ID=";
url += argv[1]; // Add product id to URL
url += "\r\nHTTP/1.1\r\n";
url += "Host: www1.macys.com\r\n";
url += "Connection: close\r\n";
char buf[BUFFSIZE]; // 512
send(sock,url.c_str(),url.length(),0); // Send socket
recv(sock,buf,BUFFSIZE,0);
|
The problem is the following. Let's say the product's web ID is 546916, so the URL would be http://www1.macys.com/shop/product/?ID=546916
The source I receive is
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html xmlns:jsp="http://java.sun.com/products/jsp/dtd/jsp_1_0.dtd" lang="en-us">
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<META NAME="ROBOTS" CONTENT="NOINDEX"><!-- o -->
<title>Not Found - Macy's</title>
<meta http-equiv="generator" content="JACPKMALPHTCSJDTCR">
<script type="text/javascript" language="JavaScript">
<!--
function getForwardPageURL()
????????
|
In fact, if you put the URL into a web browser, the source should be
<!DOCTYPE html>
<html lang="en-us">
<head>
<title>JS Boutique Dress, One Shoulder Draped Evening Dress - Womens Dresses - Macy's</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta http-equiv="generator" content="JACPKMALPHTCSJDTCR" /> |
The META NAME="ROBOTS" tag in the source that my program receives makes me think Macy's did something. What can I do? Please help. |
|
Cheese Wow Cheese is so helpful!
Joined: Mar 18 2007 Posts: 1017 Offline
|
Posted: Wed Aug 15, 2012 1:38 pm Post subject: |
User agents are different, and the server sends different content accordingly.
Also, your GET is wrong; that's a 404 page.
_________________
SSC Distension Owner
SSCU Trench Wars Developer |
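To make the two points above concrete, here is a rough sketch of how the request could be put together. In the code from the first post, the "\r\n" before "HTTP/1.1" ends the request line early, and there is no empty line to finish the headers. The helper names build_get_request and parse_status_code are made up for this example, and the User-Agent value is only a placeholder; none of this has been tested against the Macy's server, so treat it as a starting point rather than a confirmed fix.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a minimal HTTP/1.1 GET request.
// The method, path, and version share one line separated by single
// spaces, and an empty line ("\r\n") terminates the header block.
std::string build_get_request(const std::string& host,
                              const std::string& path,
                              const std::string& user_agent) {
    std::string req;
    req += "GET " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "User-Agent: " + user_agent + "\r\n";  // placeholder value, an assumption
    req += "Connection: close\r\n";
    req += "\r\n";  // blank line: end of headers
    return req;
}

// Hypothetical helper: extracts the status code from the start of a
// response such as "HTTP/1.1 404 Not Found". Returns -1 if the text
// does not begin with a recognizable status line.
int parse_status_code(const std::string& response) {
    if (response.compare(0, 5, "HTTP/") != 0) return -1;
    std::size_t sp = response.find(' ');
    if (sp == std::string::npos || sp + 4 > response.size()) return -1;
    int code = 0;
    for (std::size_t i = sp + 1; i < sp + 4; ++i) {
        char c = response[i];
        if (c < '0' || c > '9') return -1;
        code = code * 10 + (c - '0');
    }
    return code;
}
```

The string returned by build_get_request("www1.macys.com", "/shop/product/?ID=546916", "Mozilla/5.0") can be passed to send() the same way as before. Checking parse_status_code on the first bytes that recv() returns shows whether the server answered with a 404 before you look at the HTML.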
|
guesty Guest
Offline
|
Posted: Tue Sep 04, 2012 4:14 pm Post subject: |
Kind of pointless to post, but I figured I'd mention it in case anyone comes across the same problem (4e-60000% chance): in my case it's Google, which asks for a captcha after about 700 requests to Google search :\ |
|