01
视频教程
02
课程介绍
在前几个课程中,我们分别为大家介绍了:
启发灵感:从想法到实践
搭建界面的基础:Gradio 框架基础
使用模型的流程:从魔搭开源模型到模型 API 服务
前后端联调及应用发布:如何把应用完整串联并部署创空间
接下来,我们会在这节课程中让你把之前几节课学到的内容融会贯通,通过单词卡的案例来介绍一个更具实用性的完整 AI 应用的实现。
03
效果展示
大家在学习过程中可能会有各种办法来帮忙记英语单词,比如也会使用单词卡,通过例句、画面等来帮助自己记忆一些单词,这次我们就来开发一个帮忙生成专属单词的 AI 应用。
创空间体验:
Notebook:
https://modelscope.cn/notebook/share/ipynb/12ab6058/word_memory_cards.ipynb
04
IDEA
希望生成一个功能全面、内容丰富的单词卡,需要下面的几个步骤。
05
实现过程
1. 跑通模型验证
生成单词信息 => 大语言模型:Inference API
try:
url = "https://api-inference.modelscope.cn/v1/images/generations"
payload = {
"model": "MusePublic/489_ckpt_FLUX_1", # ModelScope Model-Id, required
"prompt": "一只长颈鹿", # prompt, required
}
headers = {
"Authorization": f"Bearer {MODELSCOPE_ACCESS_TOKEN}",# provide your modelscope sdk token
"Content-Type": "application/json"
}
response = requests.post(url, data=json.dumps(payload, ensure_ascii=False), headers=headers)
response_data = response.json()
image_url = response_data['images'][0]['url']
except Exception as e:
print(e)
生成单词封面图 => 文生图模型:Inference API or SwingDeploy or AIGC 结合
try:
url = "https://api-inference.modelscope.cn/v1/images/generations"
payload = {
"model": "MusePublic/489_ckpt_FLUX_1", # ModelScope Model-Id, required
"prompt": "一只长颈鹿", # prompt, required
}
headers = {
"Authorization": f"Bearer {MODELSCOPE_ACCESS_TOKEN}",# provide your modelscope sdk token
"Content-Type": "application/json"
}
response = requests.post(url, data=json.dumps(payload, ensure_ascii=False), headers=headers)
response_data = response.json()
image_url = response_data['images'][0]['url']
except Exception as e:
print(e)
生成例句语音 => 语音生成模型:SDK pipeline
audio_model_id = 'iic/speech_sambert-hifigan_tts_zh-cn_16k'
sambert_hifigan_tts = pipeline(task=Tasks.text_to_speech, model=audio_model_id)
output = sambert_hifigan_tts(input="The giraffe stretched its long neck to reach the leaves at the treetop.", voice='zhitian_emo')
wav = output[OutputKeys.OUTPUT_WAV]
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"{timestamp}.wav"
file_path = os.path.join(directory_path, filename)
with open(file_path, 'wb') as f:
f.write(wav)
生成界面卡片 => Coder 模型:Inference API
GenerateUiCodeSystemPrompt = """
你是一个网页开发工程师,根据下面的指示编写网页。
所有代码写在一个代码块中,形成一个完整的代码文件进行展示,不用将HTML代码和JavaScript代码分开。
**你更倾向集成并输出这类完整的可运行代码,而非拆分成若干个代码块输出**。
对于部分类型的代码能够在UI窗口渲染图形界面,生成之后请你再检查一遍代码运行,确保输出无误。
仅输出 html,不要附加任何描述文案。
"""
GenerateUiCodePromptTemplate = """
创建一个HTML页面,用于介绍英语单词 $title 的词源、含义及用法。页面应该包括以下部分:
- **标题区域**:分成三行展示单词“$title”及其音标 $phonetic_symbols 和基本含义:$translation_meaning。字体颜色为深绿色。
- **词源解释区域**:$etymological_explanation。字体颜色为深灰色。
- **图片带链接区域**:包含一张与单词相关的图片,图片链接为 $image_url。
- **例句区域**:提供一个使用单词 $title 的例句:“$example_sentence” 其中“$title”一词被高亮显示为深绿色,其他颜色为深灰色。英文显示。
- **播放例句**:提供一个音频播放按钮,按钮上有文字“播放例句”,点击后播放一段音频,音频链接为:$audio_url。
请确保页面布局美观,易于阅读,所有的内容居中对齐,不限制页面高度,背景颜色为浅色,边框颜色为深绿色。
"""
template = Template(GenerateUiCodePromptTemplate)
prompt = template.substitute(infos)
print('generate_ui_code:', prompt)
messages = [
{'role': 'system', 'content': GenerateUiCodeSystemPrompt },
{'role': 'user', 'content': prompt},
]
display_messages = display_messages + messages
try:
gen = client.chat.completions.create(
model="Qwen/Qwen2.5-Coder-32B-Instruct",
messages=messages,
stream=True
)
full_response = ""
display_messages.append({'role': 'assistant', 'content': full_response})
for chunk in gen:
content = chunk.choices[0].delta.content
full_response += content
display_messages[-1]['content'] = full_response
is_stop = chunk.choices[0].finish_reason == 'stop'
yield {
"display_messages": display_messages,
"content": full_response,
"is_stop": is_stop,
}
except Exception as e:
yield {
"display_messages": display_messages,
"content": str(e),
"is_stop": True,
}
2. 串联流程开发
async def generate_media(infos):
return await asyncio.gather(
generate_audio(infos['example_sentence']),
generate_image(infos['example_sentence_image_prompt'])
)
def run_flow(query, request: gr.Request):
display_messages = []
yield {
steps: gr.update(current=0),
drawer: gr.update(open=True),
}
for info_result in generate_word_info(query, display_messages):
if info_result['is_stop']:
word_info_str = info_result['content']
break
else:
yield {
display_chatbot: covert_display_messages(info_result['display_messages']),
}
infos = json.loads(word_info_str)
yield {
steps: gr.update(current=1),
display_chatbot: covert_display_messages(info_result['display_messages']),
}
display_messages.append({
'role': 'assistant',
'content': f"根据这些内容生成插图和例句发音:\n 插图:{infos['example_sentence_image_prompt']}\n 例句发音:{infos['example_sentence']}",
})
yield {
display_chatbot: covert_display_messages(display_messages),
}
generate_results = asyncio.run(generate_media(infos))
root = get_root_url(
request=request, route_path="/gradio_api/queue/join", root_path=demo.root_path
)
root = root.replace("http:", "https:")
print('root:', root)
infos['audio_url'] = f"{root}/gradio_api/file={demo.move_resource_to_block_cache(generate_results[0])}"
infos['image_url'] = generate_results[1]
yield {
steps: gr.update(current=2),
}
for ui_code_result in generate_ui_code(infos, display_messages):
if ui_code_result['is_stop']:
ui_code_str = ui_code_result['content']
break
else:
yield {
display_chatbot: covert_display_messages(ui_code_result['display_messages']),
}
yield {
drawer: gr.update(open=False),
display_chatbot: covert_display_messages(ui_code_result['display_messages']),
sandbox_output: send_to_sandbox(remove_code_block(ui_code_str)),
}
3. 设计交互界面和实现 UI
with gr.Blocks(css=css) as demo:
history = gr.State([])
with ms.Application():
with antd.ConfigProvider(locale="zh_CN"):
with antd.Row(gutter=[32, 12]) as layout:
with antd.Col(span=24, md=8):
with antd.Flex(vertical=True, gap="middle", wrap=True):
header = gr.HTML("""
<div class="left_header">
<img src="//img.alicdn.com/imgextra/i3/O1CN01Q501Kf1IjzMXFpjvL_!!6000000000930-0-tps-768-1024.jpg" width="200px" />
<h2>随心单词卡</h2>
</div>
""")
input = antd.InputTextarea(
size="large", allow_clear=True, placeholder="请输入你想要记什么单词?")
btn = antd.Button("生成", type="primary", size="large")
antd.Divider("示例")
with antd.Flex(gap="small", wrap=True):
with ms.Each(DEMO_LIST):
with antd.Card(hoverable=True, as_item="card") as demoCard:
antd.CardMeta()
demoCard.click(demo_card_click, outputs=[input])
antd.Divider("设置")
view_process_btn = antd.Button("查看生成过程")
with antd.Col(span=24, md=16):
with ms.Div(elem_classes="right_panel"):
with antd.Drawer(open=False, width="1200", title="生成过程") as drawer:
with ms.Div(elem_classes="step_container"):
with antd.Steps(0) as steps:
antd.Steps.Item(title="容我查一下词典", description="正在生成单词的各类信息")
antd.Steps.Item(title="容我补些素材", description="正在生成单词例句的助记图和发音")
antd.Steps.Item(title="即将大功告成", description="正在生成单词卡的界面")
display_chatbot = gr.Chatbot(type="messages", elem_classes="display_chatbot", height=800, show_label=False, )
sandbox_output = gr.HTML("""
<div align="center">
<h4>在左侧输入或选择你想要的单词卡开始制作吧~</h4>
</div>
""")
4. 完成完整开发和验证优化
06
进阶作业和扩展课题
进阶作业
下面内容供大家尝试去修改和实践
扩展课题
点击阅读原文,即可跳转课程合集~
?点击关注ModelScope公众号获取
更多技术信息~