Project: 文鑫 / guduo_spider
Commit 31217236888a3f4656fa6028316bdadd507f4b61, authored 2024-12-23 10:17:36 +0800 by wenxin
Commit message: Exception handling for crawl results (爬取结果异常处理)
1 parent: 2f5ad695
Showing 2 changed files with 8 additions and 5 deletions:

app/service/spider_job_service.py
app/spider/guduo_spider.py
app/service/spider_job_service.py (view file @ 3121723)

@@ -69,9 +69,13 @@ def get_job_info(taskId: int):
 async def scrawl_and_save(taskParam: SpiderParams):
-    # 执行爬虫获取结果
-    results = await startBrowser(taskParam)
-    logger.info(f"爬虫重试情况:{startBrowser.statistics}")
+    try:
+        # 执行爬虫获取结果 给下面一行代码添加 try cache try 捕获异常
+        results = await startBrowser(taskParam)
+    except Exception as e:
+        logger.info(f"爬虫重试情况:{startBrowser.statistics}")
+        logger.error(f"爬虫任务执行失败,失败原因:{e}")
+        return
     asyncTasks = (save_or_update(item) for item in results)
     await asyncio.gather(*asyncTasks)
     logger.info(f"爬虫任务执行完成,爬取到数据{len(results)}条 保存到数据库完成")
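The change above wraps the tenacity-retried startBrowser call in try/except, so a crawl that still fails after its retries logs the retry statistics ("爬虫重试情况", crawler retry status) and the failure reason ("爬虫任务执行失败,失败原因", crawler task failed, reason) and returns early instead of letting the exception escape the job. Below is a minimal, self-contained sketch of that same pattern; the function names mirror the diff, but the bodies, retry parameters, and the sample task parameter are placeholders, not the project's actual code.

# Sketch only: the try/except-around-a-retried-call pattern used by this commit.
# startBrowser, save_or_update and all retry parameters are illustrative stand-ins.
import asyncio
import logging

from tenacity import retry, stop_after_attempt, wait_exponential

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, max=10))
async def startBrowser(taskParam):
    # Placeholder crawler: always fails, to exercise the except branch.
    raise RuntimeError(f"browser failed for {taskParam}")


async def save_or_update(item):
    # Placeholder persistence step.
    await asyncio.sleep(0)


async def scrawl_and_save(taskParam):
    try:
        # Run the crawler; tenacity retries before the exception reaches us.
        results = await startBrowser(taskParam)
    except Exception as e:
        # tenacity attaches a statistics dict to the decorated function.
        logger.info(f"crawler retry statistics: {startBrowser.statistics}")
        logger.error(f"crawler task failed, reason: {e}")
        return
    # Persist all results concurrently, then report how many were saved.
    await asyncio.gather(*(save_or_update(item) for item in results))
    logger.info(f"crawler task finished, saved {len(results)} items")


if __name__ == "__main__":
    asyncio.run(scrawl_and_save({"keyword": "demo"}))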
app/spider/guduo_spider.py (view file @ 3121723)

@@ -12,8 +12,7 @@ from tenacity import (
     before_sleep_log,
     retry,
     stop_after_attempt,
-    wait_exponential,
-    wait_fixed,
+    wait_exponential
 )

 logger = logging.getLogger(__name__)
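This hunk removes the now-unused wait_fixed from the tenacity imports, leaving wait_exponential as the only wait strategy, which is consistent with the crawler retrying with exponential backoff rather than a fixed delay. The sketch below shows what such a decorator can look like; the attempt count, backoff bounds, and the startBrowser signature are assumptions for illustration, not taken from this repository.

# Illustrative only: a tenacity retry decorator consistent with the imports
# this commit keeps. All parameter values here are assumptions.
import logging

from tenacity import before_sleep_log, retry, stop_after_attempt, wait_exponential

logger = logging.getLogger(__name__)


@retry(
    stop=stop_after_attempt(3),                              # assumed: give up after 3 attempts
    wait=wait_exponential(multiplier=1, max=10),             # waits 1s, 2s, 4s, ... capped at 10s
    before_sleep=before_sleep_log(logger, logging.WARNING),  # log each retry before sleeping
)
async def startBrowser(taskParam):
    """Placeholder for the real crawler entry point."""
    ...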