cay-zhang / swiftspeech

A speech recognition framework designed for SwiftUI.

License: MIT License

Swift 98.70% Ruby 1.30%
audio combine ios speech-recognition swift swiftui user-voice voice-recognition

swiftspeech's Introduction

Hi there 👋

Cay's GitHub Stats

  • 🔭 Developer of RSSBud and SwiftSpeech
  • 💬 Ask me about SwiftUI
  • 📫 Reach me on Telegram
  • 😄 Pronouns: He/Him

Support me

Donate with WeChat Donate with Alipay

swiftspeech's People

Contributors

cay-zhang


swiftspeech's Issues

Possible volume conflict between SwiftSpeech and AVAudioPlayer?

My app involves both SwiftSpeech's features and sound effects through SwiftySound. All sound effects work fine in the simulator, but on the device all sound effects stop working once the SwiftSpeech button is pressed. I have one button that makes a "click" noise when pressed. It works until I press the SwiftSpeech button.

If I press the SwiftSpeech button first, I get a sound effect the first time, but then not for subsequent presses.

I made a new, simple project without SwiftSpeech just to test the sound, and everything worked fine on the device. I also switched out SwiftySound and used the normal AVAudioPlayer procedure, and the sound works that way, too.

So the only thing I can think of is that there is a conflict between SwiftSpeech and the sound effects. Is it possible that SwiftSpeech is turning off my sound effects? If so, how do I turn them back on? My code appears below:

import SwiftUI
import SwiftSpeech
import SwiftySound
import AVFoundation
import AudioToolbox

// VIEW MODEL
struct ContentView: View {

    let emojiArray = ["🐵", "🦍", "🐶", "🐺", "🦊", "🦝", "🐱", "🦁", "🐅", "🐴", "🦓", "🦌", "🐮", "🐷", "🐏", "🐪", "🦙", "🦒", "🐘", "🦏", "🦛", "🐁", "🐀", "🐰", "🦇", "🐻", "🐨", "🐼", "🦘", "🦃", "🐔", "🐧", "🦅", "🦆", "🦢", "🦉", "🦚", "🐸", "🐊", "🐢", "🦎", "🐍", "🐳", "🐬", "🐟", "🐙", "🐌", "🦋", "🐜", "🐝", "🐞", "🦗", "🕷", "🦂", "🦟"]

    @State private var emoji = ""
    @State private var nextEmoji = ""
    @State private var text = "What is this? (Press and Hold)"
    @State private var theDescription = ""
    @State var isCorrect: Bool
    @State var player = AVAudioPlayer()

    var body: some View {
        ZStack(alignment: .top) {
            VStack(alignment: .center) {

                Text(emoji)
                    .font(.system(size: 200, weight: .bold, design: .default))
                    .onAppear {
                        emoji = emojiArray.randomElement() ?? "none"
                        theDescription = emoji.applyingTransform(.toUnicodeName, reverse: false) ?? "None"
                        print(theDescription) // the emoji's Unicode name
                    }

                Text(text)
                    .onAppear {
                        SwiftSpeech.requestSpeechRecognitionAuthorization()
                    }
                    .padding()

                SwiftSpeech.RecordButton()
            }
            .swiftSpeechRecordOnHold()
            .onRecognize { _, result in
                text = result.bestTranscription.formattedString
                print(text)
                if theDescription.contains(text.uppercased()) {
                    print("That's right")
                    text = "That's right!"
                    isCorrect = true
                    playRightSound()
                } else {
                    print("That's wrong")
                    text = "Try again!"
                    isCorrect = false
                    playWrongSound()
                }
            } handleError: { _, _ in }

            Spacer()

            Button("Change Animal") {
                nextEmoji = emojiArray.randomElement() ?? "none"
                while nextEmoji == emoji {
                    nextEmoji = emojiArray.randomElement() ?? "none"
                }
                playClickSound()
                emoji = nextEmoji
                text = "What is this? (Press and hold)"
                theDescription = emoji.applyingTransform(.toUnicodeName, reverse: false) ?? "None"
                print(theDescription)
            }
        }
    }

    func playRightSound() {
        print("Playing right sound")
        Sound.play(file: "yay.wav")
    }

    func playWrongSound() {
        Sound.play(file: "raspberry.wav")
    }

    func playClickSound() {
        Sound.play(file: "click.wav")
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView(isCorrect: true)
    }
}
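Not a definitive answer, but one thing worth trying: SwiftSpeech configures the shared audio session when recording starts, and a record-only category silences other playback. Another issue below passes a .playAndRecord audio session configuration; a sketch along those lines (the parameter usage is assumed from the library's public API, not confirmed as a fix):

```swift
// Sketch, not a confirmed fix: pass a .playAndRecord audio session
// configuration so sound effects can keep playing while recording.
SwiftSpeech.RecordButton()
    .swiftSpeechRecordOnHold(
        sessionConfiguration: .init(audioSessionConfiguration: .playAndRecord)
    )
```

If that is not enough, reactivating your own AVAudioSession category after recording stops is the other avenue to investigate.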

Graceful notification of microphone activation failure

Hi!

I've encountered an exception due to a bug in iOS 14 beta 4 + AirPods that breaks the AirPods microphone (at the software level, hopefully). In system apps (iMessage, Recorder, ...), the issue prevents voice recording/recognition from working but does not crash the app. With SwiftSpeech, the app crashes with an uncaught exception.

Is it possible to catch such a failure and pass the error along gracefully, or at least prevent the crash?

The log message is:
Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: IsFormatSampleRateAndChannelCountValid(format)'
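Until the library guards against this, a pre-flight check is one possible mitigation for this exact exception: it fires when the input hardware reports an invalid format (0 Hz sample rate or 0 channels), which is what the broken AirPods route produces. A hedged sketch only; the engine instance and call site are assumptions, since SwiftSpeech owns the real AVAudioEngine:

```swift
import AVFoundation

// Returns false when the input route reports a format that would make
// AVAudioEngine trap with IsFormatSampleRateAndChannelCountValid.
func inputFormatIsValid(_ engine: AVAudioEngine) -> Bool {
    let format = engine.inputNode.outputFormat(forBus: 0)
    return format.sampleRate > 0 && format.channelCount > 0
}
```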

Automatic stop of recording after some seconds of silence

Hello ✌️ Thank you for such a wonderful library!

In my app I wanted to implement something similar to dictation button in Safari search:

  1. User taps button
  2. User speaks
  3. When user is not speaking for about 2 seconds, dictation stops automatically
(demo video: RPReplay_Final1698318345.MP4)

This way the user doesn't need to tap the button again to stop dictation. There are built-in modifiers, swiftSpeechRecordOnHold and swiftSpeechToggleRecordingOnTap, but both require additional interaction from the user. I also needed a different button.

Here is how I solved it; maybe it will be helpful for somebody in the future. I'd be happy to hear any comments on how this could be done better:


import SwiftUI
import SwiftSpeech

// Creating new extension with custom record button view
public extension SwiftSpeech {
    struct RecordButtonCustom: View {
        public var body: some View {
            RecordButtonView()
        }
    }
}

// Define new EnvironmentKey for custom state
struct DictationState: EnvironmentKey {
    static let defaultValue: SwiftSpeech.State = .pending
}

// Define new Environment Values for custom state
extension EnvironmentValues {
    var dictationState: SwiftSpeech.State {
        get {
            self[DictationState.self]
        }
        set {
            self[DictationState.self] = newValue
        }
    }
}

struct SwiftSpeechView: View {
    @State private var text = "Tap to Speak"
    @State private var timer: Timer?
    @State var dictationState: SwiftSpeech.State = .pending
    
    var body: some View {
        VStack() {

            Text(text)
            
            SwiftSpeech
                .RecordButtonCustom()
                .swiftSpeechToggleRecordingOnTap(locale: Locale(identifier: "en_US"))
                .onRecognizeLatest(
                    includePartialResults: true,
                    handleResult: { session, result in
                        text = result.bestTranscription.formattedString
                        
                        timer?.invalidate()
                        // initiate timer to stop recording after 2 seconds of silence
                        timer = Timer.scheduledTimer(withTimeInterval: 2.0, repeats: false) { timer in
                            session.stopRecording()
                            dictationState = .pending
                        }
                    },
                    handleError: { session, error in
                        text = "Error \((error as NSError).code)"
                    
                        session.stopRecording()
                        dictationState = .pending
                })
                .onStartRecording { session in
                    dictationState = .recording
                }
                .onStopRecording { session in
                    dictationState = .pending
                }
                .onCancelRecording{ session in
                    dictationState = .cancelling
                }

        }
        .onAppear {
            SwiftSpeech.requestSpeechRecognitionAuthorization()
        }
        .environment(\.dictationState, dictationState)
    }
}

#Preview {
    SwiftSpeechView()
}
import SwiftUI
import SwiftSpeech

struct RecordButtonView: View {
    
    @Environment(\.dictationState) var state: SwiftSpeech.State
    
    public init() { }
    
    var icon: String {
        switch state {
        case .pending:
            return "mic"
        case .recording:
            return "mic.fill"
        case .cancelling:
            return "xmark"
        }
    }
    
    public var body: some View {
        Button("Dictate", systemImage: icon, action: {
            print("Dictate")
        })
        .buttonStyle(.borderless)
        .labelStyle(.iconOnly)
        .help("Dictate")
    }
    
}

#Preview {
    RecordButtonView()
}

My stack:
Xcode 15.1 beta
visionOS 1.0

iOS 16 beta 2

This worked fine up to iOS 15.6. On iOS 16 beta 2 (iPhone 12 Pro Max), I always get "Thread 1: Fatal error: recordingSession is nil in endRecording()" in the following function when I release the speech button:

fileprivate func endRecording() {
    guard let session = recordingSession else { preconditionFailure("recordingSession is nil in \(#function)") }
    recordingSession?.stopRecording()
    delegate.onStopRecording(session: session)
    self.viewComponentState = .pending
    self.recordingSession = nil
}
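The trap is deliberate: preconditionFailure fires when endRecording() runs while recordingSession is already nil, for example when the gesture ends twice in quick succession. A defensive variant returns early instead of trapping. A minimal standalone sketch with stand-in types (the real ones live inside SwiftSpeech):

```swift
// Stand-in session type; illustrates the guard-and-return pattern only.
final class FakeSession {
    private(set) var stopped = false
    func stopRecording() { stopped = true }
}

final class Recorder {
    var recordingSession: FakeSession? = FakeSession()

    func endRecording() {
        // Early return instead of preconditionFailure: a repeated call is a no-op.
        guard let session = recordingSession else { return }
        session.stopRecording()
        recordingSession = nil
    }
}

let recorder = Recorder()
recorder.endRecording()
recorder.endRecording() // previously the second call would trap
print(recorder.recordingSession == nil) // true
```

Whether swallowing the second call silently is the right behavior for the library is a design question; it trades a crash for a possible missed state-transition bug.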

Install issues

When I use this install URL, I get the error "Unable to find a specification for 'SwiftSpeech'".

Could you please tell me how to install it through CocoaPods?

SwiftSpeech and Other Languages

I know from the examples that SwiftSpeech can handle all supported languages, but I don't see how to enable this. I gather from the example that I must add something like this for Hebrew:

public init(locale: Locale = .autoupdatingCurrent) {
    self.locale = locale
}

public init(localeIdentifier: String) {
    self.locale = Locale(identifier: "he-IL")
}

but I don't understand how to use this setting in SwiftSpeech:

Text(text)
    .onAppear {
        SwiftSpeech.requestSpeechRecognitionAuthorization()
    }

SwiftSpeech.RecordButton()
    .swiftSpeechRecordOnHold(sessionConfiguration: .init(audioSessionConfiguration: .playAndRecord))
    .onRecognize { _, result in
        text = result.bestTranscription.formattedString
        if text == word { // word from array, checking pronunciation
            playRightSound()
        } else {
            playWrongSound()
        }
    } handleError: { _, _ in }

I appreciate your help!
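A possible shortcut, rather than editing the initializers: the record modifiers take a session configuration, and its first parameter is a locale (another issue above calls the toggle-on-tap variant with a locale directly). A hedged sketch for Hebrew, assuming the configuration initializer accepts a locale parameter:

```swift
// Sketch: pass the locale at the call site instead of changing init(localeIdentifier:).
SwiftSpeech.RecordButton()
    .swiftSpeechRecordOnHold(
        sessionConfiguration: .init(
            locale: Locale(identifier: "he-IL"),
            audioSessionConfiguration: .playAndRecord
        )
    )
```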

Swift Playgrounds Compatibility

@Cay-Zhang
Adding this to Swift Playgrounds on iPadOS results in an error message: โ€œpackage doesn't have version tagsโ€. I fixed that in my fork by adding a new tag removing the โ€œvโ€œ prefix.

You can also have a look at this:
erikdoe/ocmock#496

Unable to use with TextToSpeech

Firstly, the library is awesome.

But I ran into an issue when trying to use it together with text-to-speech.

You can reproduce the issue with the following code:

import AVFoundation
func onSpeechToTextEnded() {
   let utterance = AVSpeechUtterance(string: "Hello world")
   utterance.voice = AVSpeechSynthesisVoice(language: "en-GB") 

   let synthesizer = AVSpeechSynthesizer()
   synthesizer.speak(utterance)
}

If I call this function (onSpeechToTextEnded) before actually using this library, I can hear the voice. But when I call it after using the library, no sound plays.

Could you please investigate this issue?
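One thing to rule out first, independent of SwiftSpeech: the synthesizer above is a local constant, so it can be deallocated as soon as the function returns, before the utterance finishes playing. Keeping it alive as a stored property is the usual fix (the Speaker class name is illustrative); the audio session category left active after recording is the other suspect:

```swift
import AVFoundation

final class Speaker {
    // Stored property keeps the synthesizer alive while it is speaking.
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
        synthesizer.speak(utterance)
    }
}
```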

Speech Recognition string does not match a (hard coded) string

I assign the speech recognition result to @State var speechRecogText and check whether the hardcoded string textFieldText contains it. This works properly and prints "contains voice text" in English, but with Arabic it does not.

@State var speechRecogText: String = ""
private var textFieldText: String = "قل"

if textFieldText.contains(speechRecogText) {
    print("contains voice text")
} else {
    print("doesn't contain voice text")
}

Console
// doesn't contain voice text

However when I try to swap the variables like this:

if speechRecogText.contains(textFieldText) {
    print("contains text")
} else {
    print("doesn't contain text")
}

Console
// contains text

What might be the reason for this? Does it have anything to do with the language, or with how Strings behave?
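The asymmetry is expected in any language: contains(_:) succeeds only when its argument is a substring of the receiver, and a recognizer usually returns a longer phrase than the single hardcoded word. Arabic adds a second wrinkle: visually identical strings can differ in diacritics or Unicode normalization form. A runnable sketch (the recognized phrase is an assumed example):

```swift
import Foundation

let hardcoded = "قل"
let recognized = "قل لي"   // recognizers often return a longer phrase

// A short string never contains a longer one...
print(hardcoded.contains(recognized))   // false
// ...but the longer transcription can contain the word:
print(recognized.contains(hardcoded))   // true

// Normalizing both sides guards against Unicode-form mismatches:
let a = hardcoded.precomposedStringWithCanonicalMapping
let b = recognized.precomposedStringWithCanonicalMapping
print(b.contains(a))                    // true
```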
