Using websocket for data scraping

votes
0

I want to scrape some data from here, which is implemented with websockets. After inspecting Chrome DevTools for the wss address and headers:

[screenshot]

and the negotiation message:

[screenshot]

I wrote:

from websocket import create_connection

headers = {
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,fa;q=0.8',
    'Cache-Control': 'no-cache',
    'Connection': 'Upgrade',
    'Host': 'stream179.forexpros.com',
    'Origin': 'https://www.investing.com',
    'Pragma': 'no-cache',
    'Sec-WebSocket-Extensions': 'client_max_window_bits',
    'Sec-WebSocket-Key': 'ldcvnZNquzPkSNvpSdI09g==',
    'Sec-WebSocket-Version': '13',
    'Upgrade': 'websocket',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'
}

ws = create_connection('wss://stream179.forexpros.com/echo/894/l27e2ja8/websocket', header=headers)

nego_message = '''["{\"_event\":\"bulk-subscribe\",\"tzID\":8,\"message\":\"pid-1:%%pid-8839:%%pid-166:%%pid-20:%%pid-169:%%pid-170:%%pid-44336:%%pid-27:%%pid-172:%%pid-2:%%pid-3:%%pid-5:%%pid-7:%%pid-9:%%pid-10:%%pid-945629:%%pid-11:%%pid-16:%%pid-68:%%pidTechSumm-1:%%pidTechSumm-2:%%pidTechSumm-3:%%pidTechSumm-5:%%pidTechSumm-7:%%pidTechSumm-9:%%pidTechSumm-10:%%pidExt-1:%%event-393634:%%event-393633:%%event-393636:%%event-393638:%%event-394479:%%event-394518:%%event-394514:%%event-394516:%%event-394515:%%event-394517:%%event-393654:%%event-394467:%%event-393653:%%event-394468:%%event-394545:%%event-394549:%%event-394548:%%event-394547:%%event-394550:%%event-394546:%%event-394551:%%event-394553:%%event-394552:%%event-394743:%%event-394744:%%event-393661:%%event-394469:%%event-394470:%%event-393680:%%event-393682:%%event-393681:%%event-393687:%%event-393694:%%event-393685:%%event-393689:%%event-393688:%%event-393695:%%event-393698:%%event-393704:%%event-393705:%%event-393724:%%event-393723:%%event-393725:%%event-393726:%%event-394591:%%event-393736:%%event-393733:%%event-393734:%%event-393740:%%event-393731:%%event-393732:%%event-393730:%%event-394617:%%event-394616:%%event-393737:%%event-378304:%%event-393645:%%event-394619:%%event-393755:%%event-393757:%%event-393760:%%event-393756:%%event-393758:%%event-393759:%%event-393761:%%event-393762:%%event-394481:%%event-394625:%%event-393754:%%event-394483:%%event-393775:%%event-394621:%%event-394622:%%event-376710:%%event-394623:%%event-394484:%%event-394624:%%isOpenExch-1:%%isOpenExch-2:%%isOpenExch-13:%%isOpenExch-3:%%isOpenExch-4:%%isOpenPair-1:%%isOpenPair-8839:%%isOpenPair-44336:%%cmt-1-5-1:%%domain-1:\"}"]'''

ws.send(nego_message)

while True:
    print(ws.recv())

but I am getting:

o

Traceback (most recent call last):
  File "test.py", line 647, in <module>
    print(ws.recv())
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_core.py", line 313, in recv
    opcode, data = self.recv_data()
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_core.py", line 330, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_core.py", line 343, in recv_data_frame
    frame = self.recv_frame()
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_core.py", line 377, in recv_frame
    return self.frame_buffer.recv_frame()
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_abnf.py", line 361, in recv_frame
    self.recv_header()
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_abnf.py", line 309, in recv_header
    header = self.recv_strict(2)
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_abnf.py", line 396, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_core.py", line 452, in _recv
    return recv(self.sock, bufsize)
  File "C:\Users\me\AppData\Local\Programs\Python\Python37\lib\site-packages\websocket\_socket.py", line 115, in recv
    "Connection is already closed.")
websocket._exceptions.WebSocketConnectionClosedException: Connection is already closed.
[Finished in 1.9s]

What am I missing here?

Asked 10/10/2019 at 00:45
1 answer

votes
0

The while loop calls ws.recv() a second time, after the connection has already been closed. If you simply do:

print(ws.recv())

it won't try to call .recv() on a closed connection. The response to your message is the o printed before the stack trace.
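If you do want to keep reading in a loop, one option is to stop cleanly when the connection closes instead of letting the exception propagate. The sketch below is only an illustration: iter_frames is a hypothetical helper (not part of websocket-client), and it takes the recv callable and the exception types as parameters so it does not depend on any particular library.

```python
from typing import Callable, Iterator, Tuple, Type


def iter_frames(
    recv: Callable[[], str],
    closed_errors: Tuple[Type[BaseException], ...],
) -> Iterator[str]:
    """Yield frames from recv() until it raises one of closed_errors."""
    while True:
        try:
            frame = recv()
        except closed_errors:
            return  # the peer closed the connection; stop instead of crashing
        yield frame


# Intended use with websocket-client (not executed here):
#   from websocket import create_connection, WebSocketConnectionClosedException
#   ws = create_connection(url)
#   for frame in iter_frames(ws.recv, (WebSocketConnectionClosedException,)):
#       print(frame)
```

With the code from the question this would print o and then end the loop quietly. It does not explain why the server closes the connection; it only avoids the traceback.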

As an aside, it looks like you may want a long-running connection using websocket.WebSocketApp (example) for scraping.
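A rough sketch of what that could look like follows. The URL and Origin are copied from the question, the subscription payload is shortened for illustration, and build_frame and main are hypothetical names. Wrapping the payload as a JSON array holding one JSON-encoded string is an assumption based on the shape of the negotiation message in the question; whether the server accepts it is untested.

```python
import json

# URL and Origin come from the question; the payload is a shortened example.
URL = "wss://stream179.forexpros.com/echo/894/l27e2ja8/websocket"
ORIGIN = "https://www.investing.com"
SUBSCRIBE = {"_event": "bulk-subscribe", "tzID": 8, "message": "pid-1:%%pid-8839:"}


def build_frame(payload):
    """Wrap a payload as a JSON array holding one JSON-encoded string."""
    inner = json.dumps(payload, separators=(",", ":"))
    return json.dumps([inner])


def on_open(ws):
    # Subscribe as soon as the socket is open.
    ws.send(build_frame(SUBSCRIBE))


def on_message(ws, message):
    print(message)


def on_error(ws, error):
    print("error:", error)


def on_close(ws, *args):
    # Newer websocket-client releases pass a status code and a reason;
    # older ones pass nothing, so extra arguments are accepted either way.
    print("closed", args)


def main():
    import websocket  # third-party: pip install websocket-client

    app = websocket.WebSocketApp(
        URL,
        header=["Origin: " + ORIGIN],
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )
    app.run_forever()  # blocks until the connection ends
```

Calling main() starts the connection. Building the frame with json.dumps avoids hand-written backslash escapes, which Python would otherwise interpret inside a normal (non-raw) string literal.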

Answered 10/10/2019 at 05:24