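Tech post: how to access DMM, the well-known Japanese site
First, a screenshot of the result
I had been using XX-Net to get over the wall, but DMM never loaded through it. With arzon blocked as well, there was basically no way to look up title codes by series or by actress. After some digging I found that DMM is reachable through XX-Net's GAE proxy, although for some reason it does not work when browsing through XX-Net in the usual way. So I wrote a tiny program that does a simple proxied fetch.
Note! To use my method you must start XX-Net first. You can search Baidu for an XX-Net tutorial; there is one on GitHub: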
https://github.com/XX-net/XX-Net/wiki/XXNET%E8%B6%85%E8%AF%A6%E7%BB%86%E6%95%99%E7%A8%8B
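Here is how to get DMM working.
First download the archive here: http://pan点baidu.com/s/1eSPqZ2y extraction code: **** Hidden Message *****
This is my proxy, already deployed under Tomcat. Unzip the archive to any directory and run apache-tomcat-7.0.59\bin\startup.bat. Once it has started, open a browser and go to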
http://127.0.0.1:8080/web-jersey/rest/users
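and you can browse DMM. This is a very, very simple proxy: I put the CSS into the project but stripped out all of the JS, so the styling may look a little off. I'll see whether I get a chance to improve that later.
For now, most links can be clicked and will load.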
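For anyone who wants to check that the local proxy itself works before starting Tomcat, here is a minimal JDK-only sketch of fetching a page through it. It assumes XX-Net's local HTTP proxy listens on 127.0.0.1:8087 (the port used in the code further down); the class and method names are made up for illustration and are not part of the posted project.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the posted project.
public class ProxyFetchSketch {

    // Describe an HTTP proxy at host:port.
    // Assumption: XX-Net listens on 127.0.0.1:8087.
    public static Proxy localProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Download a page through the given proxy and decode it as UTF-8.
    public static String fetch(String url, Proxy proxy) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection(proxy);
        conn.setConnectTimeout(10000);
        conn.setReadTimeout(30000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ProxyFetchSketch <url>");
            return;
        }
        // Prints the raw HTML if the proxy is up and the site is reachable through it.
        System.out.println(fetch(args[0], localProxy("127.0.0.1", 8087)));
    }
}
```

If this prints HTML for the DMM address, the proxy side is fine and any remaining problem is in the Tomcat app.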
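In case anyone wants to dig into it, I don't mind posting the code, because it really is very simple: under 70 lines in total. It's just ugly, since I wrote it in 30 minutes and posted it right away, so it's pretty rough.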
import org.apache.http.HttpHost;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.DefaultProxyRoutePlanner;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;

/**
 * Created by Administrator on 2016/8/4.
 */
@Path("/users")
public class controller {

    // Entry page: the DMM ranking list, fetched through the local XX-Net proxy.
    @GET
    @Produces(MediaType.TEXT_HTML)
    public String getIt() throws Exception {
        return fetch("http://www.dmm.co.jp/digital/videoa/-/list/=/sort=ranking/");
    }

    // Any other page: the target address arrives in the "url" query parameter.
    @Path("get")
    @GET
    @Produces(MediaType.TEXT_HTML)
    public String getUrl(@QueryParam("url") String url) throws Exception {
        System.out.println(url);
        return fetch(url);
    }

    // Download the page via XX-Net (127.0.0.1:8087) and return the rewritten body.
    private String fetch(String url) throws Exception {
        HttpGet get = new HttpGet(url);
        HttpHost proxy = new HttpHost("127.0.0.1", 8087);
        DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
        // try-with-resources closes the client and response, which the first version leaked
        try (CloseableHttpClient client = HttpClients.custom()
                .setRoutePlanner(routePlanner)
                .build();
             CloseableHttpResponse response = client.execute(get)) {
            String content = EntityUtils.toString(response.getEntity(), "utf-8");
            Document document = Jsoup.parse(content);
            // Element element = document.getElementById("w");
            Element element = document.body();
            return handle(element.html());
        }
    }

    private String handle(String html) {
        // Point site-relative and http://www links back at this proxy.
        // Note: the prefix has no /web-jersey context path, so it matches a deployment
        // at the context root (as in the Jetty config in the pom).
        String content = html
                .replaceAll("href=\"/", "href=\"http://127.0.0.1:8080/rest/users/get?url=http://www.dmm.co.jp/")
                .replaceAll("href=\"http://www", "href=\"http://127.0.0.1:8080/rest/users/get?url=http://www");
        // Drop every script block.
        content = content.replaceAll("(?s)<script.*?</script>", "");
        // Wrap the body in a page that loads the CSS bundled with the project.
        content = "<!DOCTYPE html><html><head>"
                + "<link href=\"/static/base.css\" media=\"screen\" rel=\"stylesheet\" type=\"text/css\" />"
                + "<link href=\"/static/list.css\" media=\"screen\" rel=\"stylesheet\" type=\"text/css\" />"
                + "<link href=\"/static/digital.css\" media=\"screen\" rel=\"stylesheet\" type=\"text/css\" />"
                + "</head><body name=\"dmm_main\">" + content + "</body></html>";
        return content;
    }
}
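One rough spot in the link rewriting above: the target address is pasted into the `url=` parameter unencoded, so a target that carries its own `&` query parameters would likely get cut off at the first `&`. Below is a small sketch of the same rewriting with the target URL-encoded. The class and method names are hypothetical, and it assumes the JAX-RS side decodes `@QueryParam` values as usual (the default unless `@Encoded` is used).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the posted project.
public class LinkRewriteSketch {

    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // Rewrite site-relative and http://www links so they go through the proxy endpoint.
    // proxyPrefix: e.g. "http://127.0.0.1:8080/rest/users/get?url="
    // siteRoot:    e.g. "http://www.dmm.co.jp" (no trailing slash)
    public static String rewriteLinks(String html, String proxyPrefix, String siteRoot) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = m.group(1);
            String replacement;
            if (target.startsWith("/") && !target.startsWith("//")) {
                // Site-relative link: make it absolute first.
                replacement = "href=\"" + proxyPrefix + encode(siteRoot + target) + "\"";
            } else if (target.startsWith("http://www")) {
                replacement = "href=\"" + proxyPrefix + encode(target) + "\"";
            } else {
                // Anchors, javascript: links and anything else stay untouched.
                replacement = m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Swapping this in for the two `replaceAll` calls in `handle` should keep the behaviour for simple links while letting links with query strings survive the round trip.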
It is built mainly on Jersey, HttpClient and jsoup. The pom file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.xz.web</groupId>
    <artifactId>web-jersey</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>war</packaging>
    <dependencies>
        <dependency>
            <groupId>com.sun.jersey</groupId>
            <artifactId>jersey-core</artifactId>
            <version>1.8</version>
        </dependency>
        <dependency>
            <groupId>com.sun.jersey</groupId>
            <artifactId>jersey-server</artifactId>
            <version>1.8</version>
        </dependency>
        <dependency>
            <groupId>com.sun.jersey</groupId>
            <artifactId>jersey-client</artifactId>
            <version>1.8</version>
        </dependency>
        <dependency>
            <groupId>javax.ws.rs</groupId>
            <artifactId>jsr311-api</artifactId>
            <version>1.1.1</version>
        </dependency>
        <dependency>
            <groupId>asm</groupId>
            <artifactId>asm</artifactId>
            <version>3.3.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.3.2</version>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.7.2</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>web-jersey</finalName>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <configuration>
                    <skip>true</skip>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.eclipse.jetty</groupId>
                <artifactId>jetty-maven-plugin</artifactId>
                <version>9.2.2.v20140723</version>
                <configuration>
                    <httpConnector>
                        <port>8080</port>
                    </httpConnector>
                    <!-- Scan the web app for changes at this interval and hot-redeploy automatically when any are found. The default is 0, which disables the check; any value greater than 0 enables it. -->
                    <scanIntervalSeconds>10</scanIntervalSeconds>
                    <webAppConfig>
                        <!-- Access path once the Jetty plugin is running, e.g. http://localhost:8080/testdemo for a /testdemo context path -->
                        <contextPath>/</contextPath>
                        <tempDirectory>${project.build.directory}/work</tempDirectory>
                    </webAppConfig>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
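The well-known Japanese site DMM, now that's a good thing, and what OP made is even better. Thanks, OP.
Thanks for sharing. OP, is there any way to get onto PasteBin?
Bookmarked, in case OP deletes it!
I've read a lot of posts and this is the first time I've seen such a classic!
Thanks for sharing, OP. Full support.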